ABSTRACT

The Maryland Refutation Proof Procedure System 
(MRPPS) is an interactive experimental system intended for studying 
deductive search methods. Although the work is oriented towards 
question-answering, MRPPS provides a general problem solving
capability. There are three major components within MRPPS. These are: 
(1) an inference system, (2) a search strategy and (3) a base clause 
selection strategy. The "inference system" is based on the resolution 
principle and performs the logical deductions specified. The user may 
select from a wide variety of refinements of resolution. The "search 
strategy" directs the deductions to be made by selecting from clauses 
already generated those that have the best merit. The "base clause
selection strategy" determines which facts and general axioms to 
select from the data base. Such a clause may be selected regardless 
of whether or not it has the best merit. Heuristic techniques are 
applied within each of the three major components. This technical 
report describes the current implementation of MRPPS. It describes 
each of the components and how they are integrated into what has been 
termed the "Q" algorithm. (Author/NH) 
ABSTRACT

The Maryland Refutation Proof Procedure System (MRPPS) is an
interactive experimental system intended for studying deductive search
methods. Although the work is oriented towards question-answering,
MRPPS provides a general problem solving capability.

There are three major components within MRPPS. These are: 

(1) an inference system, 
(2) a search strategy, and

(3) a base clause selection strategy. 
The inference system is based on the resolution principle and performs the
logical deductions specified. The user may select from a wide variety of
refinements of resolution. Current refinements are: set of support, linear,
P1, SL, input, and combinations of the above. Paramodulation and deletion
by tautologies and subsumption are also provided with the system.

The search strategy directs the deductions to be made by selecting 
from clauses already generated those that have the best merit. The merit
of a clause is given by f(n) = w_0 g(n) + w_1 h_1(n) + w_2 h_2(n) + ... + w_k h_k(n).
If the user specifies tie-breaking rules for equal values of clause merit,
an upwards diagonal search results in the sense of Kowalski. The upwards
diagonal search included in MRPPS generalizes the Kowalski upwards diagonal
search to an n-dimensional search.

The base clause selection strategy determines which facts and general 
axioms to select from the data base. Such a clause may be selected 
regardless of whether or not it has the best merit. 

Heuristic techniques are applied within each of the three major 
components. This technical report describes the current implementation of 
MRPPS. It describes each of the components and how they are integrated
into what has been termed the Q* algorithm.

MRPPS is written in FORTRAN V for the UNIVAC 1108 (a version of
FORTRAN IV) and runs under EXEC 8 at the University of Maryland. The
current implementation is core bound and requires approximately 60K
words of memory to run, of which 35K is for the data base and for 
working storage. 
1. Introduction

In this interim report we describe research being performed at the 
University of Maryland in the area of Question-Answering (QA) Systems.
The major goal of the research is to attempt to provide some insight into
the problem of performing deductive searches in a QA System. We
therefore do not emphasize the natural language aspects of question-answering.
The approach we have taken is to design and implement a QA
System which provides the flexibility to experiment with various search
heuristics as well as with different inference systems. The purpose of
this technical report is to describe such a system, termed the Maryland
Refutation Proof Procedure System (MRPPS).

The deductive capability of MRPPS is provided by the resolution 
principle as developed by Robinson [1965a]. Many refinements of resolution
are provided within MRPPS as described in this report. MRPPS
consists of not only inference mechanisms but also a search strategy 
that directs the search for an answer to a query. Such a system has been 
termed a proof procedure system by Kowalski [1970b]. A refutation proof
procedure is a term which applies to proof procedures in which the proof
consists of a refutation of the negation of a statement to be proved.
A refutation proof procedure system may therefore be defined as a system
P(I,Σ) that consists of two basic parts:

(1) a set of inference rules I based upon the resolution principle and

(2) a search strategy Σ for I.

The inference rules restrict the application of resolution to particular
subsets of S, the set of clauses that could potentially enter into the
proof. Examples of these rules are set-of-support and P1-deduction (as
well as unrestricted binary resolution with factoring). On the other
hand, the search strategy determines which two clauses from S should
be chosen to resolve next as determined by some cost function f applied
to each clause. Whether or not the two clauses chosen will actually be
resolved is determined by the inference system in force, which "filters
out" non-permissible inferences. We have also added to MRPPS a third
major component in addition to the inference rules and search strategy:

(3) a selection strategy for determining the sequence and timing
in which data entries and general rules are to be brought to
bear on the problem.
The second of the above system components will be called the
deduction strategy while the third component will be called the base clause
selection strategy. These two components, together, comprise a search
algorithm which we call the Q* search algorithm. The design of this
algorithm has been strongly influenced by the Σ* algorithm of Kowalski
[1970b]. The deduction strategy builds upon and extends analogous parts
of the Σ* algorithm to permit additional heuristic considerations to be
employed. The base clause selection strategy, which has no analog in
the Σ* algorithm, permits the selective use of data and axioms, and also
permits heuristic and semantic considerations to be brought to bear on
the problem. Use of the selection strategy leads to a rapid and
efficient derivation of a refutation in many instances.

The inference component of the system is based on the first-order
predicate calculus, and in particular, on the Robinson resolution principle
(Robinson [1965a]) and refinements thereof. This component
permits the selection of one of several inference systems, or of combinations
of these inference systems, for any given run of MRPPS.

Details of each of these three main system components are described 
in this report. The major innovations incorporated in MRPPS are
listed below: 

(1) MRPPS utilizes a refutation proof procedure that includes
both heuristic search and logical deduction. In the past,
most systems have used ad hoc approaches in either or both
of these processes whereas MRPPS integrates and coordinates
the two processes based on Kowalski's work.

(2) The concept of upwards diagonal search has been extended 
to the case where the heuristic component is a linear 
combination of an arbitrary number of heuristics. 

(3) The concept of merit sets developed by Kowalski has been
modified. Whereas Kowalski primarily treated the case of
clause length and level as heuristic measures, MRPPS allows
various additional heuristics to be defined.

(4) The concept of using cluster analysis as a basis for 
deriving heuristic measures has been introduced.

(5) A selection strategy is used for introducing base clauses 
into the search space. This strategy delays clauses from 
being used to generate inferences until it is essential that 
the clause be brought in. The strategy is such that base 
clauses are not given merit unless they are determined to 
be relevant to the search in progress. 

Work on the effort began approximately in February 1972. The above
three parts of the system were designed and implemented between the starting
date and August 1972. The system is written in FORTRAN V for the
UNIVAC 1108 (a version of FORTRAN IV) and runs under EXEC 8 at the
University of Maryland.* The system can be used from teletype and
has interactive capabilities that permit the user to select a wide
range of options to run his problem. Primitive capabilities exist to
enter, update, and maintain new data bases. The routines to handle the
data base are described in an appendix to this report.** The entire
system is core bound and requires approximately 60K words of memory to
run, of which 25K is for programs and 35K is for data storage, including
the entire data base and working storage.

A data base is currently being used that consists of genealogical
information about Eskimos. It is expected that experimentation with
this data base and others will be performed to gain insight into inference
mechanisms, heuristics, and semantics that might be useful for QA
systems.

* FORTRAN was chosen primarily because of availability and a high degree
of maintenance at the computer center.

** This appendix is not included with this report, but may be obtained
upon request from the authors.



2. Background 

A QA System has been characterized by Kuno [1967] as consisting of:

(1) a source language input; 

(2) a syntactic analyzer; 

(3) a semantic analyzer; 

(4) an inference and search mechanism; and, 

(5) an output language. 

There have been a large number of research efforts devoted towards QA
Systems as evidenced by state-of-the-art summaries by Simmons [1965,
1970], Minker and Sable [1970], Montgomery [1969, 1972], Salton [1968],
and Bobrow, Fraser, and Quillian [1967].

The developments reported upon in the above survey articles have
primarily emphasized the natural language processing portion represented
by the source language, syntactic analysis, and semantic analysis. The
major purpose of this effort is to investigate the inference and search
mechanisms of such systems. If an inference mechanism is important to
an application area, one can always simplify the source language input
by placing the burden of the work on the user and hence avoid machine
problems in syntactic and semantic analysis. However, if the inference
mechanism is weak in its deductive capabilities, or cumbersome, a QA
System may not be a viable tool.

A number of QA Systems have been designed that contain a deductive 
capability. There have been three major areas of development which may 
be termed 

(1) the predicate calculus approach; 

(2) the relational system approach; and

(3) the procedural language approach.

In the predicate calculus approach data is represented by instantiated
predicate expressions such as I (I Sara) and the rules
used to define predicates and their interrelationships are in the form
of general axioms such as FATHER(x, y) ∧ FATHER(y, z) => GRANDFATHER(x, z).
Pioneering work in this approach was first performed by Darlington
[1962] who applied results by Davis and Putnam [1960] to perform deductive
searches. Green and Raphael [1968] were the first to employ the
Robinson Resolution Principle [1965a] in QA Systems. Darlington [1969]
continued his efforts in QA Systems and has speculated that an inference
system based on A-ordering, P1-deduction and renaming, and set-of-support
should be used for QA Systems. Darlington [1971] is continuing
his work and is developing a system based on second order logic. Coles
[1969] has performed some studies on QA Systems, but his results are
inconclusive. Thus, the work has emphasized the "logical" aspects of
inference making in contrast to the heuristic aspects, and little
experimentation has been performed in each case.
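
As a rough illustration of what a general axiom contributes over the stored facts, the sketch below applies the grandfather rule to two FATHER facts. It is plain forward application of one rule, not a resolution proof, and the tuple representation and the names Harry, Jack, and Sally (borrowed from the example in Section 3.3) are assumptions made only for this sketch.

```python
# Ground facts as (predicate, arg1, arg2) tuples; the representation is illustrative.
facts = {
    ("FATHER", "Harry", "Jack"),
    ("FATHER", "Jack", "Sally"),
}

def derive_grandfathers(facts):
    """Apply FATHER(x, y) and FATHER(y, z) => GRANDFATHER(x, z) once."""
    fathers = [(a, b) for (p, a, b) in facts if p == "FATHER"]
    derived = set()
    for (x, y1) in fathers:
        for (y2, z) in fathers:
            if y1 == y2:  # the shared middle person y links the two facts
                derived.add(("GRANDFATHER", x, z))
    return derived

grandfathers = derive_grandfathers(facts)
```

A refutation-based system reaches the same conclusion by negating the query and deriving the null clause rather than by enumerating consequences.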

In the relational data system approach, a set of subroutines exist 
that may be linked together to derive other relations. The subroutines 
handle some general cases such as relational "composition". Two relations
may be composed when the relation P1(x,y) and the relation
P2(y,z) can yield a third relation P3(x,z). A number of QA Systems
have been designed using such general principles. A system termed
Relational Store Structure (RSS), developed by Marill [1967], contains
some general rules of this type and can, for n-ary relations, perform
deductive searches. A breadth-first search is employed to derive inferences.
Another system with a similar approach, but which only handles
binary relations, was developed by Ash and Sibley [1968]. Feldman
and Rovner [1968] have also developed a system for performing
inferences with binary relations.
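
Relational composition of two binary relations can be sketched as follows. This is a minimal illustration of the operation itself, with relations held as sets of pairs; it is not a description of how RSS or the other systems cited store or compose relations, and the function name is hypothetical.

```python
def compose(p1, p2):
    """Return every (x, z) such that (x, y) is in p1 and (y, z) is in p2."""
    by_first = {}
    for (y, z) in p2:
        by_first.setdefault(y, set()).add(z)  # index p2 on its first argument
    return {(x, z) for (x, y) in p1 for z in by_first.get(y, ())}

father = {("Harry", "Jack"), ("Jack", "Sally")}
grandfather = compose(father, father)
```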

In the procedural language approach, a language is available in 
which the user can write procedural statements that permit deductive 
searches. A system designed by Levien and Maron [1965] provides a procedural
language, INFEREX, for specifying queries. Hewitt [1970] has
developed PLANNER, a procedural language for problem solving.
PLANNER contains an automatic backtracking mechanism. QA4, a system
similar in design to PLANNER, has been developed by Rulifson [1971].
McDermott and Sussman [1972] have developed a procedural language called
CONNIVER that is similar to PLANNER, but requires no backtracking. Also,
Feldman et al [1972] have developed SAIL which is an extension of the
LEAP language and has capabilities similar to CONNIVER.

Except for the recent developments of PLANNER, QA4, CONNIVER, and SAIL,
the approaches taken to date have not used heuristic techniques extensively
to enhance the search for a solution and decrease the search time.
There has, in general, been little work in applying heuristic techniques
to theorem proving or to QA Systems in general. Norton [1972] has performed
some experiments with a theorem prover that includes paramodulation.
A major part of this study is the development of techniques to
permit heuristics to be added to deductive searches so as to help guide
the search. We are not convinced that the simple addition of heuristics
is sufficient. Extensive semantic information will also have to be used.

In his thesis, Kowalski [1970a] developed the concepts of a refutation
proof procedure system and of a search strategy that can be employed
in a theorem proving environment. The work by Kowalski generalizes
the results developed by Hart, Nilsson and Raphael [1968] for state-space
problems to theorem proving situations. Assuming that meaningful
heuristics can be developed, Kowalski shows how to apply the results to
theorem proving and has placed the results on a firm theoretical basis.
Meltzer [1971] provides a very lucid elaboration of Kowalski's work.

Initial work convinced us that a proof procedure system alone will 
not be adequate for QA Systems. Some strategies must be used to decrease
the number of clauses that are to be passed to the QA System. Semantic
information about the particular domain, and the general rules used to
deduce new results must be brought to bear on the problem. This report
describes some of our considerations to date. As more experience is
gained, we expect that additional semantic considerations will be developed.
Indeed, at the time of this report, we are investigating other
considerations. As this report was being written, we learned of independent
work by Travis, Kellogg, and Klahr [1972] who, although taking a
somewhat different approach than described in this report, have similar,
if not identical, ideas in this particular area. The work by Travis et
al does not use theorem-proving techniques. MRPPS is similar to the work
by Allen and Luckham [1970] who have developed an interactive theorem-proving
system.
3. Current Search Strategies for Theorem Proving

3.1 Ordered Search Algorithms 

In order to more clearly understand the operation of the Maryland 
Refutation Proof Procedure System described in the following sections, 
the basic concepts underlying current search strategies that may be used 
for theorem proving will be discussed. Many of these searching algori- 
thms have defined a costing function f that is used to evaluate the 
relative merit (i.e., cost) of all those nodes of the search space that 
are available for expansion. At each step of the algorithm, the node 
with the smallest cost is expanded next and its successors are placed 
back on a list containing all unexpanded nodes. In addition, each time 
a node is expanded, it is removed from this list. The nodes on this
list are then reordered with respect to f, the node with the smallest
f value is chosen for expansion, and the process is repeated until a 
solution is found (or the list is empty). This is essentially the pro- 
cedure used in the A* algorithm of Hart, Nilsson, and Raphael [1968]. 
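
The loop just described can be sketched as below. This is a generic ordered search under stated assumptions, not the A* implementation of Hart, Nilsson, and Raphael: the function and parameter names are hypothetical, a heap stands in for the reordered list, and the `seen` set that suppresses duplicate nodes is an addition not mentioned in the text.

```python
import heapq

def ordered_search(start_nodes, successors, is_goal, f):
    """Expand the open node of smallest f until a goal is found or the list empties."""
    open_list = []
    counter = 0  # insertion counter, so the heap never has to compare nodes
    for n in start_nodes:
        heapq.heappush(open_list, (f(n), counter, n))
        counter += 1
    seen = set(start_nodes)  # assumption: nodes are hashable and duplicates are skipped
    while open_list:
        _, _, node = heapq.heappop(open_list)  # node of smallest cost leaves the list
        if is_goal(node):
            return node
        for s in successors(node):  # successors go back on the list of unexpanded nodes
            if s not in seen:
                seen.add(s)
                heapq.heappush(open_list, (f(s), counter, s))
                counter += 1
    return None  # the list is empty and no solution was found
```

With a heap, the "reordering" step costs a logarithmic insertion per successor instead of a full sort of the list.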

It has been customary to define the above merit function as
f(n) = g(n) + h(n) where g(n) is an estimate of the actual cost of a
minimal cost path from a start node of the search space to the node n,
and h(n) is a heuristic estimate of the actual cost from node n to
a solution node. For the problem domain of theorem proving using resolution,
the following analogies may be established between a state-space
search (as in A*) and the search for a null clause:

(1) states (nodes) <-> clauses

(2) starting states <-> data base clauses, clauses from
the general axioms, and clauses
from the negation of the question
or theorem

(3) operators <-> binary resolution and factoring

(4) goal state <-> null clause (□)

When using an ordered search algorithm for theorem proving such as
that described above, it is necessary to order all the clauses in the
data base in the sequence of best merit first, to avoid having to search
the entire data base for the clause with best merit. This means that in
setting up a data base, one must specify the merit for all the data
base entries and axioms and then sequence them in merit order. Whenever
the user wishes to alter the definition of f(n), the data base must be
reordered again. A great deal of work is therefore required at the time
the set S is defined to the system. In addition to the large amount
of work entailed in evaluating f(n) for each clause in advance, the
above method has a further disadvantage of requiring the value of the
heuristic function h(n) to be fixed in advance of a proof. What would
be more desirable would be to avoid calculating f(n) prior to a proof
and to permit h(n) to be based upon the query input to the system
at run-time. We shall see that this will be possible to achieve.

There are two more limitations of the A* algorithm when it is used 
for theorem proving. First, it must be modified to allow the applica- 
tion of successor-generating operators requiring more than one input 
(e.g., resolution). This means that all of the successors for a clause 
cannot be formed at one time since that clause may later resolve with 
another clause that has not even been generated yet. Second, at any 
stage of a proof, every clause in the entire search space is potentially 
available for interaction with the clause being expanded (by
resolution or paramodulation). Of course various inference systems
could restrict the number of interactions, but experience with many
theorem provers has shown that in general, space would still be exhausted
before a solution would be found. What is needed is some technique
for being more selective about which clauses are allowed to interact so
that inferences are generated in a more optimal order than in A*. For
instance, whereas the A* algorithm would find all successors for a
clause C at once, an improved strategy might expand a successor of C
before finding the rest of C's successors, if it decided that this course
of action might be more productive than the latter. The Σ* algorithm
of Kowalski [1970b], described in the next section, attempts to alleviate
these last two problems.

3.2 Improving the Ordered Search Algorithm - The Σ* Algorithm

The Σ* algorithm employs a cost function f(n) = g(n) + h(n) to
measure the merit of a clause C(n). In the ensuing discussion, g(n)
will be defined as the level of the clause C(n), and h(n) as its
length. Kowalski defines both a diagonal search strategy using a diagonal
merit ordering and also an upwards diagonal strategy using
an upwards diagonal merit ordering. Let n and n' be two nodes of our
search space. Then

(1) n ≤_d n' (n has better or equal diagonal merit than          (3.1)
n') iff f(n) ≤ f(n')

(2) n ≤_u n' (n has better or equal upwards diagonal merit
than n') iff f(n) ≤ f(n') and h(n) ≤ h(n') whenever
f(n) = f(n').

It should be noted that (1) defines the diagonal search which is very 
similar to the A* algorithm mentioned previously. For an upwards dia- 
gonal search, if two clauses C(n) and C(n') have equal f values, 
we then expand that clause which is the shorter of the two since this 
action has the greatest possibility of producing shorter clauses (perhaps
the null clause).
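
The two orderings can be written as comparison functions. The sketch below assumes a node is summarized by its level g and length h, as in the discussion above; the `Node` type and the function names are introduced here for illustration only.

```python
from collections import namedtuple

# g is the level of the clause, h its length; f = g + h as in the text.
Node = namedtuple("Node", ["g", "h"])

def f(n):
    return n.g + n.h

def diagonal_leq(n, m):
    """n has better or equal diagonal merit than m."""
    return f(n) <= f(m)

def upwards_diagonal_leq(n, m):
    """As diagonal merit, but equal f values are decided by the smaller h (length)."""
    if f(n) != f(m):
        return f(n) < f(m)
    return n.h <= m.h
```

For two clauses with f = 3, one of length 1 and one of length 2, the diagonal ordering treats them as equal in merit, while the upwards diagonal ordering prefers the shorter clause.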

As clauses are generated during the search (by inclusion from the 
base set S , by resolution, or by factoring), they are placed in dis- 
joint sets called A sets (actually implemented as lists). Thus a 
clause C(n) is stored in set A(i,j) if it has level j and length 
i . This is different than the A* strategy where clauses are essentia- 
lly separated into disjoint subsets according to f value rather than 
by both g and h values separately. 
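
A minimal sketch of this bookkeeping, assuming clauses are held as tuples of literal strings and the sets as lists in a dictionary keyed by (length, level); the names `a_sets` and `store` are hypothetical.

```python
from collections import defaultdict

a_sets = defaultdict(list)  # key (i, j) = (length, level) -> list of clauses

def store(clause, level):
    """Place a clause in A(i, j), where i is its length and j its level."""
    a_sets[(len(clause), level)].append(clause)

store(("F(Jack, Sally)",), 0)                         # a unit base clause: A(1,0)
store(("-F(x,y)", "-F(y,z)", "G(x,z)"), 0)            # a base axiom: A(3,0)
```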

Σ* attempts to generate clauses approximately in upwards diagonal
merit order. That is, the strategy would try to generate clauses for
set A(i,j) before those in A(i',j') if:

(1) i + j < i' + j' or          (3.2)

(2) i + j = i' + j' and i < i'
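
The order in which the sets are visited under these two conditions can be enumerated directly. The generator below is only a sketch of the ordering; the bound `max_sum` is an assumption added so that the enumeration terminates.

```python
def merits_in_order(max_sum):
    """Yield (i, j) with i + j <= max_sum: smaller i + j first, then smaller i."""
    for total in range(max_sum + 1):
        for i in range(total + 1):
            yield (i, total - i)

order = list(merits_in_order(2))
```

Under this enumeration (1,0) is followed by (0,2), which is the sequence of FILL calls referred to later in this section.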

If Σ* is generating clauses for the set A(i,j) it may do so in the
following ways: 

(1) if j = 0 , then a clause from the data base of length i 
may be placed in A(i,j) ; 

(2) if j > 0 , a clause in A(i + 1 , j - 1) may be factored; or 

(3) if j > 0, a clause C(n1) in A(i1,j1) may be resolved
against C(n2) in A(i2,j2), where

i = i1 + i2 - 2
j = max(j1,j2) + 1.
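
The merit arithmetic in rules (2) and (3) can be checked with two small functions. They compute only the (length, level) pair of the result, following the rules as stated (a factor is one literal shorter and one level deeper than its parent); the function names are hypothetical.

```python
def factor_merit(i, j):
    """Merit of a factor of a clause taken from A(i, j), per rule (2)."""
    return (i - 1, j + 1)

def resolvent_merit(m1, m2):
    """Merit of a resolvent of clauses from A(i1, j1) and A(i2, j2), per rule (3)."""
    (i1, j1), (i2, j2) = m1, m2
    return (i1 + i2 - 2, max(j1, j2) + 1)
```

For instance, a unit base clause of merit (1,0) resolved against a three-literal base axiom of merit (3,0) gives a resolvent of merit (2,1).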

Note that at the start of a proof, all of the A-sets are empty
(unlike A*). They are filled by a routine called FILL(i,j) that
generates, either by (1) or (3) above, clauses of merit (i,j). Note
that during FILL(i,j), all parent clauses of resolvents formed by (3)
are clauses previously generated and placed in A-sets of better merit
than (i,j). When no more clauses can be formed in this manner,
FILL(i',j') is called where (i',j') is the "next" merit after (i,j)
with respect to the ordering ≤_u. FILL(0,0) is called when Σ* begins.

A second subroutine, RECURSE(C(n)), is called whenever FILL(i,j)
generates a clause C(n). Because C(n) is a newly formed clause, it
may very well interact with other clauses previously generated to produce
a new inference. Thus, RECURSE(C(n)) generates, by (2) or (3)
above, all clauses of merit (i',j') ≤_u (i,j) that are immediate
descendants of C(n). Upon generation of the successor C(n'),
RECURSE(C(n')) is called to form immediate successors of clause C(n').
This recursive process continues until some level of RECURSE(C(n'))
fails to generate any clauses meeting the above merit constraints.
(Note that this does not mean that C(n') has no successors at all.)
Control is then returned to the previous level of RECURSE which continues
to find the "next" successor for the clause local to its
level of recursion. Eventually FILL is re-entered, and the FILL-ing
and RECURSE-ing process continues until the null clause is found (or
time or space bounds are exceeded).

Although the Σ* algorithm is theoretically very elegant, it is
bound to fail for any practical application where the data base is 
fairly large (several hundred clauses perhaps). This is because typi- 
cal question-answering data bases consist primarily of unit clauses, 
(i.e., clauses of merit (1,0)). Thus, when FILL(1,0) is called, all 
of these units will be placed in A(1,0) before FILL(0,2) is called. 
As in the A* algorithm, we have again deluged our theorem prover with 
data clauses, most of which are totally irrelevant to the query. 
3.3 An Example Using the Σ* Algorithm

As an example of the kind of trouble one can get into, let the data
base S consist of |S| clauses, |S| - 1 of which are units. Among
the units are the clauses F(Jack, Sally) and F(Harry, Jack) where
F(x,y) may be interpreted as "x is the father of y". The single non-unit
(in clause form) is -F(x,y) V -F(y,z) V G(x,z) which states:
"if x is the father of y and y is the father of z, then x is the grandfather
of z."

Now suppose that the negation of the query is -(∃x) G(x, Sally).
In clause form this is -G(x, Sally). We assume that there is no
data base clause of the form G(A, Sally) for some constant A, such
that an immediate contradiction would be possible. Also assume that in
the implementation of Σ*, all supported clauses are stored separately
from the non-supported clauses of the same merit so that no explicit
tests need be made to determine whether a clause has support or not.

When FILL(1,0) is called, each unit base clause is brought in
one by one. As each clause C is generated, RECURSE is called with C
as its argument. Since C cannot be factored (we assume a fully factored
data base), C is resolved against all clauses in A(1,0) that have
already been generated and that have support. Since only -G(x, Sally) has
support, no resolvents are formed until this clause is brought in.
When it is, it is resolved against every clause in A(1,0). We fail
in each case since no direct contradiction exists. We thus continue to
fill A(1,0) until we have brought in all |S| unit clauses and have
attempted |S| resolution operations. Note that this figure includes
the theorem that is also of merit (1,0). No more clauses are generated
at all until FILL(3,0) is called and Σ* then brings in the general
axiom -F(x,y) V -F(y,z) V G(x,z). RECURSE is called and -G(x, Sally)
immediately resolves with the axiom to yield -F(x,y) V -F(y, Sally) of
merit (2,1). Thus, |S| + 1 resolution operations have been attempted
so far.
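
The resolution step between the negated query and the general axiom can be reproduced with a small binary resolution routine. This is a sketch for function-free clauses only, with no factoring and no connection to the MRPPS routines. It assumes that a literal is a (sign, predicate, arguments) tuple, that variables are written in lower case and constants capitalized, and that the second clause is renamed apart with a "_2" suffix; all function names are hypothetical.

```python
def is_var(term):
    return term[0].islower()  # convention for this sketch: lower case means variable

def walk(term, subst):
    while is_var(term) and term in subst:
        term = subst[term]
    return term

def unify(args1, args2):
    """Most general unifier of two argument tuples (no function symbols), or None."""
    subst = {}
    for a, b in zip(args1, args2):
        a, b = walk(a, subst), walk(b, subst)
        if a == b:
            continue
        if is_var(a):
            subst[a] = b
        elif is_var(b):
            subst[b] = a
        else:
            return None  # two different constants cannot be unified
    return subst

def rename(clause, suffix):
    return [(s, p, tuple(a + suffix if is_var(a) else a for a in args))
            for (s, p, args) in clause]

def resolve(c1, c2):
    """Return all binary resolvents of two clauses given as lists of literals."""
    c2 = rename(c2, "_2")  # standardize the variables of c2 apart from c1
    resolvents = []
    for l1 in c1:
        for l2 in c2:
            if l1[1] == l2[1] and l1[0] != l2[0] and len(l1[2]) == len(l2[2]):
                s = unify(l1[2], l2[2])
                if s is not None:
                    rest = [l for l in c1 if l is not l1] + [l for l in c2 if l is not l2]
                    resolvents.append([(sg, p, tuple(walk(a, s) for a in args))
                                       for (sg, p, args) in rest])
    return resolvents

query = [(False, "G", ("x", "Sally"))]                       # -G(x, Sally)
axiom = [(False, "F", ("x", "y")), (False, "F", ("y", "z")),
         (True, "G", ("x", "z"))]                            # -F(x,y) V -F(y,z) V G(x,z)
resolvents = resolve(query, axiom)
```

Up to the renaming of variables, the single resolvent is -F(x,y) V -F(y, Sally), the clause of merit (2,1) named in the text.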

Now when RECURSE is called, -F(Sally, Sally) is formed as a factor
of merit (1,2). Let us assume that no other clause successfully resolves
with -F(Sally, Sally) when RECURSE is called. However, tests
are made against all clauses in A(1,0) as well as -F(Sally, Sally)
itself. This is allowed since a resolvent of C1 ∈ A(1,0) and C2 ∈ A(1,2)
has merit (0,3) ≤_u (3,0); the same applies to A(1,2). Thus
|S| + 1 tests are made. Next, all unit base clauses are resolved
against the clause in A(2,1). This is permissible since a clause in
A(1,0) resolves with a clause of A(2,1) to give a clause of merit
(1,2) ≤_u (3,0). Recall that we are filling A(3,0). In the worst
case F(Harry, Jack) and F(Jack, Sally) might be the next to last
and last clauses on the A(1,0) list. Thus |S| - 1 resolutions would
be attempted before -F(Jack, Sally) would be placed in A(1,2). When
this clause is RECURSE-d on, another |S| steps are taken before □
is found and placed in A(0,3). Thus a total of 4|S| + 1 resolution
steps were needed to answer a very simple question! These figures
ignore the possibility that -F(x,y) V -F(y, Sally) might have resolved
with other units before resolving with F(Harry, Jack). Thus for large
|S|, a proof might never have been found. Thus it seems that a search
strategy is not practical for question-answering systems if it uses only
length and level as components of f and employs no strategy for selecting
semantically appropriate clauses from the data base.
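
The total can be checked by adding the per-phase counts given in the example. The phase names below are labels introduced for this sketch; the counts themselves are the ones stated in the text.

```python
def worst_case_attempts(s):
    """Sum the resolution attempts counted in the example for a data base of |S| = s."""
    fill_units = s        # FILL(1,0): attempts made while bringing in the unit clauses
    axiom_step = 1        # the negated query resolved against the general axiom
    factor_tests = s + 1  # tests made for the factor -F(Sally, Sally)
    before_jack = s - 1   # attempts before -F(Jack, Sally) is produced
    final_pass = s        # attempts before the null clause is found
    return fill_units + axiom_step + factor_tests + before_jack + final_pass
```

The sum is 4|S| + 1, so a data base of 500 clauses would need on the order of two thousand attempted resolutions for this one-step grandfather query.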
4. Maryland Refutation Proof Procedure System (MRPPS)

4.1 Introduction to MRPPS

4.1.1 Overview of the System

The development of a new refutation proof procedure system has been
motivated by the limitations of search strategies such as Σ* and
A*, most of which have been mentioned in the previous section. These
limitations are as follows:

(1) the data base must be reordered whenever the definition of 
f(n) is altered so that the clause of best merit may be 
easily located; 

(2) the merit of base clauses must be calculated prior to a 
proof ; 

(3) clauses are selected from the data base without reference to 
any semantic information; and, 

(4) the merit function f(n) = g(n) + h(n) is composed of only 
two parameters where g(n) and h(n) may be composite func- 
tions but are considered as single values. 

It should be clear that the first three limitations must be corrected
in order to successfully answer questions about a large data base.
However, at the present time, it is not known whether the last point is
really a limitation since composite g and h functions may permit as
much discrimination between clauses as does an evaluation function f
with several separate components. This question should be answered
through further experimentation.
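
An evaluation function with several separate components can be sketched as a weighted sum of g and an arbitrary number of heuristic values. The tie-breaking rule shown (compare the heuristic components in turn when the sums are equal) is a hypothetical choice made for this sketch and is not taken from the MRPPS implementation; the function names are likewise assumptions.

```python
def merit(features, weights):
    """Weighted sum w0*g + w1*h1 + ... + wk*hk for features (g, h1, ..., hk)."""
    if len(features) != len(weights):
        raise ValueError("one weight is needed per feature")
    return sum(w * x for w, x in zip(weights, features))

def sort_key(features, weights):
    """Order by merit first; break ties on h1, h2, ... (a hypothetical rule)."""
    return (merit(features, weights),) + tuple(features[1:])

weights = (1, 1, 1)
a = (1, 2, 0)  # g = 1, h1 = 2, h2 = 0
b = (2, 1, 0)  # g = 2, h1 = 1, h2 = 0
ordered = sorted([a, b], key=lambda c: sort_key(c, weights))
```

Both clauses have merit 3 here, so only the tie-breaking components decide that b is considered before a.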

As described in the introduction, MRPPS consists of three main
components: a set of inference rules, a deduction strategy, and a
base clause selection strategy. The last two together comprise the Q*
search algorithm that controls the search for a refutation. Figure 1 
gives an overview of the system and shows how the three main components are 
integrated into the system. Note that solid lines represent con- 
trol paths whereas dotted lines represent data paths between components 
of the system. MRPPS is controlled by an executive that communicates 
with the user as well as with the deduction strategy routines and data 
base creation and maintenance routines. There is also a free-store
maintenance routine that handles the allocation and return of core
storage.

The data base of MRPPS may be separated into two different data 
structures at the user's option. The first form indexes clauses by
predicate sign, then by each predicate name of the clause, and within
equal predicate names, by clause length. The second form has one additional
level of indexing; namely, within predicate names, clauses are indexed
by term names, and further indexed by clause length if the term names
are identical. Thus, a greater degree of discrimination is permitted in 
the second form than the first. It is advisable that the user enter 
an axiom into the first structure only if it contains no functions or 
constants. In this way, the base clause selection strategy will select 
clauses in as optimal a fashion as possible. 
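
The first of the two index structures can be sketched with nested dictionaries. This shows only the three index levels named above (sign, predicate name, clause length); the clause representation and the function name are assumptions, and the actual MRPPS storage layout is not described in this section.

```python
from collections import defaultdict

def build_index(clauses):
    """Index clauses by predicate sign, then predicate name, then clause length."""
    index = defaultdict(lambda: defaultdict(lambda: defaultdict(list)))
    for clause in clauses:
        # each distinct (sign, name) pair in the clause yields one index entry
        keys = {(positive, name) for (positive, name, _args) in clause}
        for positive, name in keys:
            index[positive][name][len(clause)].append(clause)
    return index

unit1 = ((True, "F", ("Jack", "Sally")),)
unit2 = ((True, "F", ("Harry", "Jack")),)
axiom = ((False, "F", ("x", "y")), (False, "F", ("y", "z")), (True, "G", ("x", "z")))
index = build_index([unit1, unit2, axiom])
```

The second form would add one more dictionary level, keyed by term names, between the predicate name and the clause length.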

There are several implementation details that should be mentioned
here. As noted previously, MRPPS is implemented on the UNIVAC 1108.
Since addresses on the 1108 are 16 bits long, only 65K words of memory
may be accessed directly. However, up to 262K words may be referenced
by indirect addressing. Since 65K is not a realistic limitation for an
experimental system, all data referencing is therefore done indirectly,
even though a loss of speed is suffered.

In addition, the current version of the system is core-based. For
large data bases, however, this is out of the question, and thus it is 
planned that future versions of MRPPS will have the capability of 
accessing data on peripheral devices rather than only in core. 

MRPPS has been designed to be a flexible as well as a computationally
efficient system. We have attempted to provide a wide variety of
inference mechanisms as well as various heuristic measures for use 
during the search for refutations. Details of the inference routines will 
be discussed in section 4.1.2, the deduction strategy will be described in 
section 4.5 and the base clause selection strategy will be described in 
section 4.4. Where appropriate, references are given for the various 
techniques that have been used in MRPPS.

Figure 1. System Overview of MRPPS

4.1.2 Inference Systems Implemented 

MRPPS allows the user considerable flexibility in choosing which
inference mechanisms (or combinations of mechanisms) to use. The following
inference mechanisms are currently implemented. With each inference
mechanism we provide a reference to the relevant literature.

(1) Binary Resolution and Factoring (Robinson [1965a]) 
Unrestricted resolution allows resolution between any two
clauses $C_1 \in S$ and $C_2 \in S$ on two literals, $\ell_1 \in C_1$ and
$\ell_2 \in C_2$, where S is the set of clauses available for consideration
by resolution at some stage of a search. S contains all
factors and resolvents generated during the search and those base
clauses made available by the base clause selection strategy.

(2) Set-of-Support (Wos et al. [1965])

In this inference system, the set S is subdivided into two
subsets K and S-K such that S-K is satisfiable. Clauses in
K are said to have support and thus initially, K consists of all
clauses from the negation of the theorem. A resolvent $R(C_1,C_2)$
is permitted if and only if $C_1 \in K$ and $C_2 \in S$, and each resolvent
is placed in K when it is formed. Set-of-support is important
since it restricts the set of different potential proofs.

(3) P1-Deduction (Robinson [1965b], Meltzer [1966])

A resolvent $R(C_1,C_2)$ is permitted if and only if either $C_1$
or $C_2$ is a positive clause.

(4) Linear Resolution (Luckham [1968], Loveland [1968])

A linear derivation from a set of base clauses S is a
sequence of clauses $C_1, C_2, \ldots, C_n$ such that $C_1 \in S$ and each
$C_{i+1}$ is a resolvent of $C_i$ and B where either

a) B is a base clause, or

b) B is an ancestor of $C_i$.

(5) Input Resolution (Chang [1970]) 

This is the same as linear resolution except that B may only 
be a base clause. Input resolution is not complete. 

(6) Linear Resolution With Selection Function (Kowalski and 
Kuehner [1972]) 

SL-resolution is a refinement of linear resolution and is 
very similar to model elimination (Loveland [1969]). The funda- 
mental difference between linear and SL-resolution is that in the 
latter, a single literal is selected from each clause in the 
linear derivation. When input resolution is performed on the
clause $C_i$, only the selected literal may be resolved upon,
whereas in linear resolution, any literal could be resolved upon.
This has the effect of eliminating redundant proofs.

All of the first six inference systems, with the exception of input
resolution, are complete and sound.

(7) Combinations 

MRPPS allows many combinations of the above inference systems
to be used simultaneously, although in some cases, the resulting
inference system may not be complete. For instance, if set-of-support
and P1-deduction were used concurrently, we could guarantee
completeness only if the conjectured theorem together with all
positive clauses had support. The user must determine for himself
whether the combination that he is using is complete. The system
does not automatically prevent incomplete combinations of systems
from being used.

(8) Paramodulation (Robinson and Wos [1969]) 

Paramodulation is a substitution rule that infers new clauses
by substituting a term $t_1$ for a term $t_2$ in a clause C such
that $t_1$ equals $t_2$. The resultant clause C* is said to be inferred
by paramodulation. More formally, suppose that there exists
a predicate $E(t_1,t_2)$ that expresses the equality of terms
$t_1$ and $t_2$. Let A and $E(t_1,t_2) \lor B$ be two clauses whose
variables have been standardized apart. Here B is the remainder
of the clause containing $E(t_1,t_2)$. Suppose that a term t in A
and $t_1$ have a most general common instance:

\[ t_1\sigma = t\sigma . \]

Let A' be the result of replacing in $A\sigma$ some single occurrence
of $t\sigma$ by $t_2\sigma$. Then the clause $A' \lor B\sigma$ is inferred by
paramodulation.

Paramodulation is important since it circumvents the problem 
resulting when equality between terms is handled by resolution. 
It is available in MRPPS as an option to the user whereas resolution
and factoring are always performed. It has been shown by
Robinson and Wos [1969] that paramodulation is complete when used
in a functionally reflexive system; i.e., for all functions f,
$x_i = x_j$ implies $f(x_i, \ldots) = f(x_j, \ldots)$.
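
The restrictions imposed by set-of-support, P1-deduction and input resolution can be viewed as filters applied to a candidate pair of parent clauses before resolution is attempted. The following Python sketch illustrates this view under simplifying assumptions: clauses are frozensets of (sign, atom) literals, the function names is_positive and permitted are hypothetical, and the linearity requirement on the other parent in input resolution is not checked. It is not a description of the MRPPS routines.

```python
def is_positive(clause):
    """A positive clause contains no negated literal."""
    return all(sign == "+" for sign, _ in clause)

def permitted(c1, c2, *, supported=None, p1=False, base=None):
    """Return True if resolving c1 with c2 passes every active restriction.

    supported: set K of clauses having support, or None (set-of-support off)
    p1:        True to require that one parent be a positive clause
    base:      set of base clauses, or None (input restriction off)
    """
    if supported is not None and c1 not in supported and c2 not in supported:
        return False                      # neither parent has support
    if p1 and not (is_positive(c1) or is_positive(c2)):
        return False                      # P1: one parent must be positive
    if base is not None and c1 not in base and c2 not in base:
        return False                      # input: one parent must be a base clause
    return True

neg_theorem = frozenset({("-", "Q")})
axiom_1 = frozenset({("-", "P"), ("+", "Q")})
axiom_2 = frozenset({("+", "P")})
K = {neg_theorem}
```

Here permitted(neg_theorem, axiom_1, supported=K) holds, whereas permitted(axiom_1, axiom_2, supported=K) does not, since neither axiom has support. Activating several keyword arguments at once corresponds to a combination of inference systems; as noted in (7), nothing in such a filter guarantees that the combination is complete.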

4.1.3 Deletion Rules Used in MRPPS 

Whenever a clause is generated by any means during a proof, various
checks may be made to determine if the clause is redundant. A clause
is considered redundant if it has already been formed before, or if its
presence will contribute nothing to the search process. If either of
these conditions is detected, the clause may be deleted (if so desired
by the user).

MRPPS employs three optional deletion rules: deletion of tautolo- 
gies, of alphabetic variants, and of subsumed clauses. A set S of 
clauses (namely all the clauses in our generated search space) is un- 
satisfiable irrespective of whether or not tautologies are deleted. If 
they were left in the search space, however, they would only tend to 
generate irrelevant inferences. Since detection of tautologous clauses 
is a reasonably efficient process, it is therefore advantageous to al- 
ways perform tautology elimination. Also, completeness is not lost 
for any of the above systems. 

We may define subsumption as follows: a clause $C_1$ subsumes a
clause $C_2$ if $C_1\sigma \subseteq C_2$ for some substitution $\sigma$. It has been shown
that the unsatisfiability of a set of clauses S is not affected if a
clause $C_1 \in S$ subsumes a newly formed clause $C_2$, and $C_2$ is
consequently deleted immediately after it has been formed. This is a
special case of subsumption that preserves the completeness of the
search strategy. Difficulties can occur if $C_2$ subsumes $C_1$ and $C_1$
is arbitrarily deleted (see Kowalski [1970a]). The process of subsumption
is often costly in terms of computer time since a search of the
set of generated clauses is necessary every time a clause is formed
during a proof; furthermore, the subsumption test is itself a
time-consuming operation. This option must therefore be used with care.

Since subsumption is relatively time-consuming, as a compromise we
employ a third deletion rule for alphabetic variants. Two clauses are
said to be alphabetic variants of each other if they are identical up
to a change of variable names. Thus, $C_1 = P(f(a,x),y)$ is a variant
of $C_2 = P(f(a,z),x)$. It should be clear that $C_1\sigma \subseteq C_2$ for the
renaming $\sigma$ that replaces x by z and y by x, and thus $C_2$
can be safely removed since it is subsumed by $C_1$. This rule has the
advantage that it is computationally efficient whereas in general,
subsumption is not. Our experience has indicated that many redundant
clauses are formed in a typical question-answering search, and that
deletion of alphabetic variants is advisable.
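
The two inexpensive deletion tests can be sketched as follows in Python. The representation is an assumption made for illustration: a clause is a tuple of (sign, atom) literals, atoms and function terms are nested tuples headed by their name, and variables are strings with a leading "?". The variant test below compares literals in the order given, which is a simplification; a full test would also consider reorderings of the literals. The names is_tautology and are_variants are hypothetical.

```python
def is_tautology(clause):
    """A clause is a tautology if it contains a literal and its complement."""
    return any(("-" if sign == "+" else "+", atom) in clause
               for sign, atom in clause)

def _match(t1, t2, fwd, bwd):
    """Match two terms, extending a one-to-one variable renaming fwd/bwd."""
    if isinstance(t1, tuple) and isinstance(t2, tuple):
        return (len(t1) == len(t2) and t1[0] == t2[0]
                and all(_match(a, b, fwd, bwd) for a, b in zip(t1[1:], t2[1:])))
    if isinstance(t1, str) and isinstance(t2, str):
        if t1.startswith("?") and t2.startswith("?"):
            # the renaming must be consistent in both directions
            return fwd.setdefault(t1, t2) == t2 and bwd.setdefault(t2, t1) == t1
        return t1 == t2                    # constants must be identical
    return False

def are_variants(c1, c2):
    """True if c1 and c2 are identical up to a change of variable names."""
    if len(c1) != len(c2):
        return False
    fwd, bwd = {}, {}
    return all(s1 == s2 and _match(a1, a2, fwd, bwd)
               for (s1, a1), (s2, a2) in zip(c1, c2))

C1 = (("+", ("P", ("f", "a", "?x"), "?y")),)   # P(f(a,x),y)
C2 = (("+", ("P", ("f", "a", "?z"), "?x")),)   # P(f(a,z),x)
```

For the example of the text, are_variants(C1, C2) holds, whereas P(x,x) and P(x,y) are rejected because the renaming would have to map x to two different variables. Both tests run in time linear in the size of the clauses compared, which is the reason they are attractive in comparison with general subsumption.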

4.2 A Generalization of P and Upwards Diagonal Search

4.2.1 A Revised Definition of Diagonal and Upwards Diagonal Merit Orderings

As described in Section 3, when f(n) is defined in terms of only
clause length and level it does not discriminate well enough between those
clauses that are relevant and those that are not, and it is doubtful
that any costing function composed of only two clause features would be
sufficient. It should be possible to define a more effective costing
function by letting g(n) and h(n) be linear combinations of clause
features, such that g(n) and h(n) are considered as single values.
Alternately, f may be redefined to be a linear combination of an
arbitrary number of functions rather than just one or two such that each
function retains its value rather than being imbedded in g(n) or h(n).
For instance, we might like f to be of the form
$f(n) = w_0 g(n) + w_1 h_1(n) + w_2 h_2(n)$, where the $w_i$ are weights.
Here g could be clause level, $h_1$ could be clause length and $h_2$
could be some form of functional complexity.

We would like to define our evaluation function f to be a function
of 2m > 0 arguments. That is, $v = f(w_1, w_2, \ldots, w_m;\ f_1, f_2, \ldots, f_m)$,
where v is the value of the function, the $w_i$ are weights and the
$f_i$ are feature values for a node in our search space (i.e., a clause).
We might then define f to be a linear discriminant function of the
form

\[ f(W,F) = w_1 f_1 + w_2 f_2 + \cdots + w_m f_m \tag{4.1} \]

where W is the row vector of weights and F is the column vector of
features. Thus (4.1) can be re-written as a vector product:

\[ f(W,F) = W \cdot F . \tag{4.2} \]

The function f(W,F) defines a point in one-space and the vector W
defines a point in m-dimensional weight space whereas F defines a
point in m-dimensional feature space. Since the latter is a Euclidean
space we shall call it $E^m$.

As an extension of this definition, we may redefine the merit
orderings $\le_d$ and $\le_u$ so that we can distinguish between clauses of
equal f value. This parallels equation 3.1. Let node n have feature
vector F and node n' have feature vector F'. In addition,
let W be the weight matrix with off-diagonal elements equal to zero
and with diagonal elements $w_1, w_2, \ldots, w_m$.

(1) $n \le_d n'$ iff $f(W,F) \le f(W,F')$; (4.3)

(2) if $f(W,F) = f(W,F')$ then $n \le_u n'$ iff $w_1 f_1 \le w_1 f_1'$;

(3) if $f(W,F) = f(W,F')$ and $w_1 f_1 = w_1 f_1'$ then $n \le_u n'$ iff
$w_2 f_2 \le w_2 f_2'$;

...

(m) if $f(W,F) = f(W,F')$ and for j = 1,...,m-2, $w_j f_j = w_j f_j'$
then $n \le_u n'$ iff $w_{m-1} f_{m-1} \le w_{m-1} f_{m-1}'$.

If in addition, $w_{m-1} f_{m-1} = w_{m-1} f_{m-1}'$, then $n =_u n'$. Note that the
m-th case of $w_m f_m = w_m f_m'$ was not stated since if the equality
condition holds for Steps 1 to m-1 and if $w_{m-1} f_{m-1} = w_{m-1} f_{m-1}'$,
then $w_m f_m = w_m f_m'$ since $f(W,F) = f(W,F')$.
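
Because each step of Equation 4.3 is consulted only when all earlier steps have produced equality, the ordering amounts to a lexicographic comparison. The following Python sketch expresses it as a sort key for a diagonal weight matrix; the function name merit_key and the sample weights and feature vectors are hypothetical and serve only as an illustration.

```python
def merit_key(weights, features):
    """Sort key for the ordering of Equation 4.3 (diagonal weight matrix).

    The key is (f(W,F), w_1 f_1, ..., w_{m-1} f_{m-1}); Python compares
    tuples lexicographically, so a smaller key means better or equal merit.
    The last weighted feature is omitted because it is determined by the
    sum together with the other components.
    """
    weighted = [w * f for w, f in zip(weights, features)]
    return (sum(weighted), *weighted[:-1])

w = (1, 2, 1, 1)                 # diagonal elements of W (sample values)
key_n = merit_key(w, (0, 1, 0, 1))
key_n_prime = merit_key(w, (1, 1, 0, 0))
```

Both nodes have f(W,F) = 3, so Step (1) does not separate them; Step (2) compares the first weighted features, 0 against 1, and the first node is therefore of better merit. A list of nodes could be ordered with sorted(nodes, key=...) using such a key.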

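Equations 4.1 and 4.3 essentially define linear transformations
upon the feature vector F to produce a new vector $\hat{F}$. The
first transformation, $\tau_1$, was of the form $\tau_1: E^m \to E^1$ and
produced a new vector $\hat{F}$ in one-space $E^1$ by the following matrix
transformation:

\[ \hat{F} = \tau_1(F) = W \cdot F = (w_1\ w_2\ \cdots\ w_m) \begin{pmatrix} f_1 \\ f_2 \\ \vdots \\ f_m \end{pmatrix} = (w_1 f_1 + w_2 f_2 + \cdots + w_m f_m). \tag{4.5} \]

The second transformation $\tau_m$ was of the form $\tau_m: E^m \to E^m$ and
formed $\hat{F}$ in m-space $E^m$ by the following transformation:

\[ \hat{F} = \tau_m(F) = W \cdot F = \begin{pmatrix} w_1 & & 0 \\ & \ddots & \\ 0 & & w_m \end{pmatrix} \begin{pmatrix} f_1 \\ \vdots \\ f_m \end{pmatrix} = \begin{pmatrix} w_1 f_1 \\ \vdots \\ w_m f_m \end{pmatrix}. \tag{4.6} \]

As an example of such a transformation, let m = 4, F = (0,1,0,1) and
the diagonal elements of W = (1,2,1,1). Then

\[ \hat{F} = \tau_4(F) = W \cdot F = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 2 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 0 \\ 1 \\ 0 \\ 1 \end{pmatrix} = \begin{pmatrix} 0 \\ 2 \\ 0 \\ 1 \end{pmatrix}. \]

These two transformations are two members of a family of linear
transformations we shall call T. Each $\tau_i \in T$ is defined
$\tau_i: E^m \to E^i$, $1 \le i \le m$. Each vector component of
$\hat{F} \in E^i$ will be a linear combination of some subset of the
components of $F \in E^m$. Nodes of our search
space are thus evaluated on the basis of their merit which is represented
by an i-dimensional vector in $E^i$, namely $\hat{F}$ (not F).
We must thus work directly with the transformation matrix W
instead of a vector W as before. It will be convenient to define $W_j$
as the vector representing the j-th row of W; similarly, $f_j$ and $\hat{f}_j$
are the j-th elements of F and $\hat{F}$ respectively. $\hat{F}$ is then derived
from F by the following transformation:

\[ \tau_i(F) = \hat{F} = W \cdot F = \begin{pmatrix} w_{11} & w_{12} & \cdots & w_{1m} \\ w_{21} & w_{22} & \cdots & w_{2m} \\ \vdots & & & \vdots \\ w_{i1} & w_{i2} & \cdots & w_{im} \end{pmatrix} \begin{pmatrix} f_1 \\ f_2 \\ \vdots \\ f_m \end{pmatrix} = \begin{pmatrix} W_1 \cdot F \\ W_2 \cdot F \\ \vdots \\ W_i \cdot F \end{pmatrix}. \tag{4.7} \]
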

As an example of a linear transformation of F, let $\tau_i: E^m \to E^i$.
Then $\hat{F} = W \cdot F$ might be represented by a vector whose components
are linear combinations of the $f_j$, containing terms such as $2f_3$
and $-f_4$.



Keeping this transformation in mind, we may now give a revised
definition for $\le_d$ and $\le_u$ with respect to the transformation
$\tau_i: E^m \to E^i$. This new definition will be termed Ordering-A. Let node n
have feature vector F and node n' have feature vector F'. W is
the weight matrix with i rows and m columns. Define

\[ f(W,F) = \sum_{j=1}^{i} W_j \cdot F = \sum_{j=1}^{i} \hat{f}_j . \tag{4.8} \]

That is, f is the sum of the components of $\hat{F}$. Note that some other
measure of diagonal merit could have been used. The above form was
chosen because it resembles a linear discriminant function.

Diagonal and upwards diagonal merit for Ordering-A can thus be
defined by:

(1) $n \le_d n'$ iff $f(W,F) \le f(W,F')$; (4.9)

(2) if $f(W,F) = f(W,F')$, then $n \le_u n'$ iff $W_1 \cdot F \le W_1 \cdot F'$;

(3) if $f(W,F) = f(W,F')$ and $W_1 \cdot F = W_1 \cdot F'$ then $n \le_u n'$
iff $W_2 \cdot F \le W_2 \cdot F'$;

...

(m) if $f(W,F) = f(W,F')$ and for j = 1,...,m-2, $W_j \cdot F = W_j \cdot F'$
then $n \le_u n'$ iff $W_{m-1} \cdot F \le W_{m-1} \cdot F'$.

If in addition, $W_{m-1} \cdot F = W_{m-1} \cdot F'$ then $n =_u n'$.




We thus see that the merit vector $\hat{F}$ is composed of i components,
each of which is a linear combination of some subset of m clause
features. However, up to this point, we have not related the concept of
distinct g and h functions with that of multiple components of $\hat{F}$.
In order to do this, some components will be defined as "g-type" and
others as "h-type" since certain features inherently are a measure of
the cost of a path from a start node to a node n, whereas others are
a measure of the heuristic cost of a path from node n to a goal node.
One would expect a search strategy that utilizes this information to
perform better than one that does not. Note that for Ordering-A, the
lower subscripted components of $\hat{F}$ are minimized before those numbered
higher. Since the heuristic function h should in practice be minimized
before the g function, the user should make sure that the lower
numbered components correspond to the heuristic component h.

A variation of Ordering-A that differentiates more between g and
h parameters will be called Ordering-B. In this ordering all h components
would by convention be the subvector $(W_1 \cdot F, W_2 \cdot F, \ldots, W_j \cdot F)$
and the g components would be $(W_{j+1} \cdot F, W_{j+2} \cdot F, \ldots, W_i \cdot F)$ where
there are j h-components and i-j = k g-components for the transformation
$\tau_i: E^m \to E^i$. Let

\[ H = \sum_{\ell=1}^{j} W_\ell \cdot F \quad \text{and} \quad G = \sum_{\ell=j+1}^{i} W_\ell \cdot F \tag{4.10} \]

for node n; H' and G' are defined similarly for node n'.

Diagonal and upwards diagonal merit for Ordering-B can thus be
defined by:

(1) $n \le_d n'$ iff $f(W,F) \le f(W,F')$; (4.11)

(2) if $f(W,F) = f(W,F')$ then $n \le_u n'$ iff $H \le H'$;

(3) if $f(W,F) = f(W,F')$ and $H = H'$ then $n \le_u n'$ iff
$W_{j+1} \cdot F \le W_{j+1} \cdot F'$ (i.e., we are comparing the first g
components);

...

(k+1) if $f(W,F) = f(W,F')$, $H = H'$ and for $\ell$ = j+1,...,i-2,
$W_\ell \cdot F = W_\ell \cdot F'$ then $n \le_u n'$ iff $W_{i-1} \cdot F \le W_{i-1} \cdot F'$;

(k+2) if in addition, $W_{i-1} \cdot F = W_{i-1} \cdot F'$ then $n \le_u n'$ iff
$W_1 \cdot F \le W_1 \cdot F'$ (i.e., we now start discriminating on the
basis of h components);

...

(i) if $f(W,F) = f(W,F')$, $H = H'$, and for $\ell$ = j+1,...,i-1,
$W_\ell \cdot F = W_\ell \cdot F'$ and for $\ell$ = 1,...,j-2, $W_\ell \cdot F = W_\ell \cdot F'$
then $n \le_u n'$ iff $W_{j-1} \cdot F \le W_{j-1} \cdot F'$.

If in addition $W_{j-1} \cdot F = W_{j-1} \cdot F'$ then $n =_u n'$. Thus, whereas
Ordering-A treats all merit parameters alike (except for their order in
the sequence $\hat{f}_1, \hat{f}_2, \ldots, \hat{f}_i$ defined by the user), Ordering-B groups all
h-parameters together and minimizes their sum before resorting to pairwise
parameter comparisons. At the present time, it is not clear which
method leads to a more efficient search strategy. Further experimentation
will hopefully help us in this regard.
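
The difference between the two orderings can be made concrete with a small Python sketch in which each ordering is expressed as a lexicographic sort key. The function names, the choice of an identity weight matrix and the sample feature vectors are assumptions made for illustration; the sketch follows the step sequences of Equations 4.9 and 4.11 with i taken as the number of rows of W.

```python
def transform(W, F):
    """Compute the merit vector W . F, one component per row of W."""
    return [sum(w * f for w, f in zip(row, F)) for row in W]

def ordering_a_key(W, F):
    """Key for Ordering-A: f, then the components of the merit vector in order."""
    merit = transform(W, F)
    return (sum(merit), *merit[:-1])

def ordering_b_key(W, F, j):
    """Key for Ordering-B: f, then H, then g-components, then h-components.

    The first j rows of W are taken as h-type, the remaining rows as g-type.
    The last g-component and the last h-component are omitted because they
    are determined by f, H and the components already compared.
    """
    merit = transform(W, F)
    h, g = merit[:j], merit[j:]
    return (sum(merit), sum(h), *g[:-1], *h[:-1])

I3 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]     # identity weight matrix
n, n_prime = (1, 3, 0), (2, 1, 1)          # sample feature vectors, f = 4 for both
```

With two h-type components (j = 2), Ordering-A prefers n because its first component is smaller (1 against 2), whereas Ordering-B prefers n' because the sum of its h-components is smaller (3 against 4). The example shows only that the two orderings can disagree; it does not indicate which leads to the more efficient search.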

It is interesting to note several special cases of the above two
orderings that result when the transformation $\tau_i: E^m \to E^i$ is varied.
In all these cases, however, Ordering-A is identical to Ordering-B
since only two components of $\hat{F}$ are allowed.

The above merit orderings are identical to a strictly diagonal
merit ordering (i.e., only values of f(W,F) are compared) when
i = 1 and $\tau_1: E^m \to E^1$ is used (equation 4.5). For i = 2,
$\tau_2: E^m \to E^2$ is the transformation used. If in addition, m = 2 and
$w_{11} = w_{22} = 1$, the ordering defined is the same as that described by
Kowalski [1970b]. Thus, if $f_1 = g$ and $f_2 = h$,

\[ \hat{F} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} g \\ h \end{pmatrix} = \begin{pmatrix} g \\ h \end{pmatrix} \]

and

\[ f(W,F) = f_1 + f_2 = g + h . \]

A similar case occurs when m = 2, $f_1 = g$, $f_2 = h$, $w_{11} = (1-\omega)$,
$w_{22} = \omega$ $(0 \le \omega \le 1)$, and $w_{12} = w_{21} = 0$. Then

\[ \hat{F} = \begin{pmatrix} 1-\omega & 0 \\ 0 & \omega \end{pmatrix} \begin{pmatrix} g \\ h \end{pmatrix} = \begin{pmatrix} (1-\omega) g \\ \omega h \end{pmatrix} \]

and $f(W,F) = (1-\omega) g + \omega h$ corresponds to the representation of
evaluation functions given by Pohl [1970]. A more general case that can be
made to conform to Pohl's representation is for $\tau_2: E^m \to E^2$ and
$m \ge 2$. Recall that we defined f(W,F) to be $\sum_{j=1}^{i} W_j \cdot F$. If i = 2,
$W_1 \cdot F$ may be thought of as g and $W_2 \cdot F$ as h, where g and h
are composite functions. If we now assign additional weights $(1-\omega)$
and $\omega$ to g and h respectively, we can redefine f as

\[ f(W,F) = \sum_{j=1}^{2} \alpha_j W_j \cdot F = (1-\omega)\, W_1 \cdot F + \omega\, W_2 \cdot F \tag{4.14} \]

where $\alpha_1 = 1-\omega$ and $\alpha_2 = \omega$, $0 \le \omega \le 1$.
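
Equation 4.14 can be evaluated directly once the two rows of W are fixed. The following Python sketch does so; the function name pohl_f, the sample weight rows and the sample feature vector are hypothetical and are given only to show how the parameter $\omega$ shifts the emphasis between the composite g and h.

```python
def pohl_f(W, F, omega):
    """Evaluate f(W,F) = (1 - omega) * W_1.F + omega * W_2.F (Equation 4.14).

    W must have exactly two rows: W[0] combines features into the composite
    g, and W[1] combines features into the composite h.
    """
    if len(W) != 2:
        raise ValueError("W must have exactly two rows")
    if not 0.0 <= omega <= 1.0:
        raise ValueError("omega must lie between 0 and 1")
    g = sum(w * f for w, f in zip(W[0], F))
    h = sum(w * f for w, f in zip(W[1], F))
    return (1 - omega) * g + omega * h

W = [[1, 0, 0],      # g: the first feature alone (sample choice)
     [0, 1, 2]]      # h: second feature plus twice the third (sample choice)
F = (3, 2, 1)        # sample feature vector, giving g = 3 and h = 4
```

For this data, omega = 0 yields f = 3 (only g is counted), omega = 1 yields f = 4 (only h is counted), and omega = 0.5 yields f = 3.5.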

The advantages of using a representation such as (4.14) are fourfold.

(1) As stated previously, it seems intuitive that a search strategy
that differentiates as much as possible between g and
h parameters should perform better than one that does not.
The ordering of (4.14) enables us to do this.

(2) Results have been obtained by Kowalski for the case of
m = 2, i = 2 with respect to admissibility and optimality.
We therefore know a great deal about how such a search strategy
should behave.

(3) The general definition of upwards diagonal merit ordering describes
an algorithm for ascertaining whether or not a node
n is of better or equal merit than another node n'. We
have found that such an algorithm is fairly efficient for i = 2
but rapidly becomes computationally inefficient for i > 2.
We would therefore prefer to use only two components to keep
our search overhead as low as possible.

(4) It seems very likely that the same amount of discrimination
between clauses achieved by the transformation $\tau_m: E^m \to E^m$
may be obtained through judicious choice of the weights in W
for the transformation $\tau_2: E^m \to E^2$. If this is the case,
equation 4.14 is the preferable ordering to use.

It is planned that future versions of MRPPS will allow all of the
above merit orderings including that of Equation 4.14. However, the
current implementation allows only Ordering-A and Ordering-B with an
identity weight matrix W used in conjunction with a transformation
$\tau_m: E^m \to E^m$ for m = (Section 4.5.3 lists the clause features
currently available). In particular, MRPPS now gives the user the capability
of selecting (1) the ordering to be used, (2) which parameters
to treat as "g-type" and which to treat as "h-type", and (3) which
clause feature to assign to each component of $\hat{F}$. Note that (2) is
only significant if Ordering-B is chosen.

It should be emphasized that insufficient attention to points (1)- 
(3) above can lead to a very inefficient proof procedure. For instance, 
if level were considered an "h-type" and length as a "g-type" parameter, 
the search process would tend to generate long clauses before shorter 
clauses quite contrary to the meaning of heuristic distance! 

4.2.2 The Completeness, Admissibility, and Optimality of Ordering-A
and Ordering-B

It is the purpose of this section to show the compatibility of
these orderings with the diagonal and upwards diagonal orderings
defined by Kowalski for the function f(n) = g(n) + h(n). Let $\le_d$ and
$\le_u$ be diagonal and upwards diagonal orderings so defined (refer to
Equation 3.1) and let Σ* be a search strategy. Step (1) of Equation
4.9 for Ordering-A is essentially identical to Step (1) of Equation
3.1 that defines a diagonal merit ordering. The other steps of 4.9
serve to discriminate more finely between clauses of equal f values.
Thus, all clauses generated by Σ* using $\le_d$ for a given f will also
be generated by Σ* using Ordering-A for that f value. Thus, Σ* with
Ordering-A is certainly complete when Σ* with $\le_d$ is complete. Also,
Ordering-A permits at least as "good" or minimal a solution as $\le_d$
does, and also allows Σ* to be as optimal (i.e., as well informed) as
$\le_d$ allows Σ* to be. If the heuristics are such that in advance of
performing resolution, the set to which the resolvent belongs is
known with certainty, and the conditions on the heuristic specified below
are met, then the algorithm is admissible since the null clause is
found on some diagonal. Each diagonal is explored before going on to
the next one. Hence, admissibility cannot be violated.

A similar correspondence can be made between Ordering-B and $\le_u$.
Steps (1) and (2) of Equation 4.11 are essentially identical to Steps
(1) and (2) of Equation 3.1 that define an upwards diagonal merit ordering.
Again, the remaining steps of Equation 4.11 serve only to discriminate
more finely between clauses with equal f values and equal
H values (recall that H is the sum of all h-type components of $\hat{F}$).
Σ* with Ordering-B is therefore complete when Σ* with $\le_u$ is complete.
Also, Ordering-B is admissible for the same reason that Ordering-A is
admissible.

It should be noted that the completeness, admissibility and
optimality of Σ* depend upon the characteristics of the heuristic
components used in the evaluation function f. For instance, if only
the parameter of clause length is used as the heuristic measure, Σ* is
not even complete and therefore not admissible or optimal. Although we
do not feel that admissibility and optimality are crucial for
question-answering systems, refutation completeness is a more desirable property.
By refutation completeness we mean that Σ* will generate the null
clause □ whenever a contradiction does indeed exist (although the
entire search space need not be generated as is required for completeness).
Thus, great care should be taken in choosing the heuristic components.

In relation to the above, Kowalski has shown [1970b] that diagonal
and upwards diagonal merit orderings allow Σ* to be admissible if the
following two conditions are met:

(1) $\le_d$ (or $\le_u$) must be δ-finite; that is, for any node n' in
our search space, Σ* will generate only a finite number of
nodes of better or equal merit; and,

(2) that part of f(n) designated as the heuristic function
h(n) must satisfy the lower bound condition; that is,
$h(n) \le h^*(n)$ for all n in the search space, where h* is
the actual heuristic cost of node n, and h is only an
estimate.

In particular, if $\le_d$ and $\le_u$ are δ-finite, diagonal and upwards
diagonal search are complete. Optimality follows when h(n) satisfies
conditions (1) and (2) above and also a monotonicity condition, namely:

(3) $f(n') < f(n)$ for $n' \prec n$ and $f(n^*) = g(n^*)$ for a null
clause n*, where $n' \prec n$ means that node n' is generated
before node n.

4.3 Heuristic Measures Used in MRPPS

4.3.1 Clause Level 

It is often advantageous to define the evaluation function f
with a component being clause level. The level of a clause is defined
as:

\[ \ell(C) = \begin{cases} 0 & \text{if } C \text{ is a base clause} \\ \max(\ell(C_1), \ell(C_2)) + 1 & \text{where } C = \text{Resolvent}(C_1, C_2) \\ \ell(C_1) + 1 & \text{where } C = \text{Factor}(C_1). \end{cases} \]

The level of a clause C is a measure of how many inference steps were
required to derive C from the base clauses. Thus, if level were the
only component of f used with an algorithm such as Σ*, a breadth-first
search would result since all derivations of level i would be
formed before those of level i+1. Obviously it should not be used
alone. However, it can be valuable when used in conjunction with other
components since it tends to keep the search from running away in an
infinite deduction path.
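
The recursive definition of level translates directly into a short function. In the Python sketch below, a derivation is recorded as a dictionary mapping each clause name to the tuple of its parents; this bookkeeping scheme and the names level and parents are assumptions made for illustration.

```python
def level(clause, parents):
    """Return the level of a clause.

    parents[clause] is () for a base clause, (c1,) for a factor of c1,
    and (c1, c2) for a resolvent of c1 and c2. Both derived cases reduce
    to 'one more than the largest parent level'.
    """
    ps = parents[clause]
    if not ps:
        return 0
    return max(level(p, parents) for p in ps) + 1

# A small sample derivation: R1 = Resolvent(A, B), F1 = Factor(R1),
# R2 = Resolvent(F1, A).
parents = {
    "A": (), "B": (),
    "R1": ("A", "B"),
    "F1": ("R1",),
    "R2": ("F1", "A"),
}
```

For this derivation the levels of R1, F1 and R2 are 1, 2 and 3 respectively. In a system that generates many clauses, the level would normally be stored with the clause when it is formed rather than recomputed.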

4.3.2 Clause Length 

The number of literals of a clause C is a very rough measure of
the cost incurred in inferring the null clause from C. This heuristic
has been used in many other theorem provers (e.g., Garvey and Kling
[1969], Norton [1972]), and is a very effective component of f(n).
The system QA3.5 used only length as a heuristic function by employing
the unit preference strategy (Wos, Robinson, Carson [1964]). However,
length alone, or even in conjunction with clause level, does not lead to
a practical heuristic, as seen in Section 3.3. In combination with
other components, though, length will help the search for a
contradiction.

It is interesting to note that if length is the sole heuristic
used with the Σ* algorithm (or with the Q* algorithm of MRPPS, to
be described later), a different search results than for unit preference
as defined in QA3.5. Recall that in the latter, all units are resolved
against other units to produce all inferences possible. If
none are found, all length two clauses are resolved with all units,
then length three clauses, etc. On the other hand, Σ* would resolve
units with units, then two-clauses with units, then two-clauses with
two-clauses, then three-clauses with units, etc. Thus, Σ* tries to
minimize the length of resolvents, rather than always attempting to
reduce clause length as does the unit preference strategy.

4.3.3 Clause Complexity

A simple definition of functional complexity may also be used as a
heuristic component of f. The complexity of a clause is defined to
be the maximum level of function nesting taken over all predicates of a
clause. For example, the following list gives samples of the complexity
of several clauses:

Clause                         Complexity

P                              0
P(x,y)                         1
P(f(g(a)),f(z)) V Q(c)         3
P(f(a),g(b),h(c,x))            2

The complexity heuristic is as follows: if $C_1$ and $C_2$ are clauses
with complexity $h_1$ and $h_2$ respectively, and $h_1 < h_2$, then it is
better to generate successors for $C_1$ rather than for $C_2$. In other
terms, we would like to keep our proof as "simple" as possible with
respect to clause complexity.
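
The complexity measure is a depth computation over the term structure of each literal. The Python sketch below reproduces the four sample values listed above. The representation is an assumption made for illustration: an atom or function term is a tuple headed by its name, variables and constants are plain strings, and a clause is a list of (sign, atom) pairs (the signs shown are arbitrary, since the measure ignores them). The names depth and complexity are hypothetical.

```python
def depth(term):
    """Nesting depth: 0 for a variable, constant or argument-free predicate,
    otherwise one more than the deepest argument."""
    if isinstance(term, tuple) and len(term) > 1:
        return 1 + max(depth(arg) for arg in term[1:])
    return 0

def complexity(clause):
    """Maximum nesting depth taken over all literals of the clause."""
    return max((depth(atom) for _, atom in clause), default=0)

samples = [
    [("+", ("P",))],                                       # P
    [("+", ("P", "x", "y"))],                              # P(x,y)
    [("+", ("P", ("f", ("g", "a")), ("f", "z"))),
     ("+", ("Q", "c"))],                                   # P(f(g(a)),f(z)) V Q(c)
    [("+", ("P", ("f", "a"), ("g", "b"), ("h", "c", "x")))],  # P(f(a),g(b),h(c,x))
]
values = [complexity(c) for c in samples]
```

The list values evaluates to [0, 1, 3, 2], in agreement with the table. A variant that sums depth over the literals instead of taking the maximum would require changing only the body of complexity.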

On the other hand, there is a very good reason to generate clauses
of high complexity before those of low complexity. Clauses that have
deeply nested functions will probably resolve with a very small number
of other clauses whereas a simple clause with variables is apt to
resolve with a large number.

This effect is compounded when a complex clause is resolved with
another clause by substituting terms containing constants for other
terms. The resulting clause is partially instantiated and thus unlikely
to resolve with many clauses. Since we are trying to keep the number
of clauses generated to a minimum, this seems like a reasonable approach
to take.

In any case, the above definition of complexity does not seem to
distinguish finely enough between clauses and we do not know whether or
not it will be useful. Perhaps a definition that sums the complexities
of all predicates of a clause would yield a better heuristic. Future
versions of MRPPS will utilize different complexity definitions.

4.3.4 Cluster Distance 

The heuristics described in the previous sections are syntactic in 
nature. That is, they measure properties of a clause which can be de- 
termined by simply examining the clause and measuring how much of this 
property is present, e.g., how many literals are in the clause. Pre- 
vious work in theorem proving has also made use of these types of 
measures and although they have proven theoretically useful, they do 
not provide sufficient direction to the search to permit its use in 
practical situations. 

In the present section we shall define a heuristic measure which 
is more semantic in nature. The measure will take into account the
nature of the general axioms in the system's data base and will provide
some insight as to when in the search certain of the axioms are likely
to be useful. As will be demonstrated by an example, the heuristic
should prove useful in restricting the number of clauses generated
during the course of the search. The use of clusters as a heuristic
was proposed by Minker in Minker and Sable [1970]. It has been defined
more precisely in this report.

Before we are able to define what we mean by cluster distance,
some preliminary discussion will be necessary to establish what we mean
by semantic clusters relative to which the distances are measured.

In the data base of a question-answering system we shall have a
set A of general axioms given in clause form, $A = \{A_1, A_2, \ldots, A_n\}$.
By taking an accounting of the predicates occurring in these clauses we
may construct a set $P = \{P_1, P_2, \ldots, P_m\}$ of predicates occurring in A.
We may then define an axiom-predicate matrix $A_P = [a_{ij}]$ as the
following:

\[ a_{ij} = \begin{cases} 1 & \text{if predicate } P_j \text{ occurs at least once in axiom } A_i, \text{ regardless of whether it appears as a negated literal or not negated} \\ 0 & \text{otherwise.} \end{cases} \]

We may now define a predicate-predicate matrix $B_P = [b_{ij}]$ as:

\[ B_P = A_P^T \cdot A_P \]

where $A_P^T$ is the transpose of $A_P$.

The matrix $B_P$ indicates the degree of relatedness of predicates
in the following sense: the value of the element $b_{ij}$ of $B_P$ gives
the exact number of axioms in which both predicates $P_i$ and $P_j$ occur.
If $b_{ij} \ne 0$, we may say that $P_i$ and $P_j$ are related. But, if
$b_{ij} = 0$, then $P_i$ and $P_j$ do not co-occur in any axiom. It may be
that $b_{ij} = 0$, but that there is some $P_k$ such that $b_{ik} \ne 0$ and
$b_{kj} \ne 0$, in which case since both $P_i$ and $P_j$ are related to $P_k$ they
are somewhat related to each other. In general, there may be predicates
$P_{k_1}, P_{k_2}, \ldots, P_{k_r}$ such that $b_{i k_1} \ne 0$, $b_{k_1 k_2} \ne 0$, ...,
$b_{k_{r-1} k_r} \ne 0$, $b_{k_r j} \ne 0$, in which case we might say that $P_i$
and $P_j$ are rather tenuously related. If there were no such chain of
predicates between $P_i$ and $P_j$ we would say that they are completely
unrelated.

The relationships described by the matrix $B_P$ may be depicted by
a graph. To $B_P$ there corresponds an undirected graph $G_P$, which we
term a semantic graph, containing one node per predicate and an edge
between predicates $P_i$ and $P_j$ iff $b_{ij} \ne 0$. In general, the graph
will consist of a number of connected components, that is, of individual
subgraphs which are not connected to each other. We term these connected
components semantic clusters.

We provide the following specific example to make the above notions 
more concrete: 

The set A of general axiom clauses:

A1. $P_6(x,y) \lor P_8(f(x),y)$
A2. $P_2(x,y) \lor P_3(x) \lor P_1(v,x)$
A3. $P_3(x,y) \lor P_4(x)$
A4. $P_1(x,v) \lor P_1(y,z) \lor P_2(x,z)$
A5. $P_6(x,v) \lor P_7(x) \lor P_7(y)$
A6. $P_5(x,x) \lor P_4(x)$

The Set P of Predicates:
$P = \{P_1, P_2, \ldots, P_8\}$

From A we find that the axiom-predicate matrix A has the value: 

P 

000 0 0101 
11100000 
00110000 

A = 

P 11000000 
00000110 
00011000 



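From $A_P$ we find the predicate-predicate matrix has the value:

$$B_P = A_P^T \cdot A_P = \begin{pmatrix}
2 & 2 & 1 & 0 & 0 & 0 & 0 & 0 \\
2 & 2 & 1 & 0 & 0 & 0 & 0 & 0 \\
1 & 1 & 2 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 2 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 2 & 1 & 1 \\
0 & 0 & 0 & 0 & 0 & 1 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 0 & 1
\end{pmatrix}$$
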
The semantic graph $G_P$ corresponding to $B_P$ is shown below as
consisting of two semantic clusters:

    Cluster I:   nodes P1, P2, P3, P4, P5
                 edges P1-P2, P1-P3, P2-P3, P3-P4, P4-P5
    Cluster II:  nodes P6, P7, P8
                 edges P6-P7, P6-P8

FIGURE 2. A SEMANTIC GRAPH WITH TWO CLUSTERS
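
The matrices and clusters of this example can be reproduced with a few lines of code. The following is a minimal sketch, not part of MRPPS: the function names are hypothetical, and the predicate sets per axiom are read off the rows of the axiom-predicate matrix.

```python
# Predicates occurring in each example axiom (indices 1..8), one set per row
# of the axiom-predicate matrix. All names here are illustrative only.
AXIOMS = {
    "A1": {6, 8},
    "A2": {1, 2, 3},
    "A3": {3, 4},
    "A4": {1, 2},
    "A5": {6, 7},
    "A6": {4, 5},
}
N_PREDICATES = 8


def axiom_predicate_matrix(axioms, n):
    """Row per axiom, column per predicate; entry 1 if the predicate occurs."""
    return [[1 if p in preds else 0 for p in range(1, n + 1)]
            for preds in axioms.values()]


def predicate_predicate_matrix(a):
    """B = A^T * A: entry (i, j) counts the axioms containing both P_i and P_j."""
    n = len(a[0])
    return [[sum(row[i] * row[j] for row in a) for j in range(n)]
            for i in range(n)]


def semantic_clusters(b):
    """Connected components of the graph with an edge i-j iff b[i][j] != 0."""
    n = len(b)
    seen, clusters = set(), []
    for start in range(n):
        if start in seen:
            continue
        stack, component = [start], set()
        seen.add(start)
        while stack:
            i = stack.pop()
            component.add(i + 1)  # report 1-based predicate indices
            for j in range(n):
                if i != j and b[i][j] != 0 and j not in seen:
                    seen.add(j)
                    stack.append(j)
        clusters.append(component)
    return clusters


A_P = axiom_predicate_matrix(AXIOMS, N_PREDICATES)
B_P = predicate_predicate_matrix(A_P)
CLUSTERS = semantic_clusters(B_P)
print(CLUSTERS)  # prints [{1, 2, 3, 4, 5}, {6, 7, 8}]
```

The diagonal entries of the product count the axioms in which a predicate occurs at all, which is why they are skipped when edges are formed.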

We note that the clusters give a graphical account of the
interrelatedness of the various predicates. For example, we see that $P_1$,
$P_2$, $P_3$ are closely interrelated, but are less closely related to, say,
$P_5$. Also, the predicates in Cluster I are completely unrelated to
those in Cluster II.

Suppose we are presented with a query, either posed in, or converted
to, the clause form of the first-order predicate calculus. Suppose
that all of the predicates in the query occur in, say, Cluster I.
Then, any axioms which give rise to Cluster II, i.e., axioms containing
instances of $P_6$, $P_7$, or $P_8$, are irrelevant to responding to the
query and need never be considered for this purpose. Since the
"Cluster II" axioms, and any of their successors, cannot logically
interact with the query clauses, they are never used in the proof. Less
obvious advantages of using the clusters will be discussed presently.
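
The pruning argument above can be sketched directly on the axioms, without forming the matrices. This is an illustrative sketch only: `relevant_axioms` is a hypothetical helper, and the predicate sets are those of the six example axioms.

```python
def relevant_axioms(axioms, query_predicates):
    """Return the names of axioms lying in a cluster that contains a query predicate.

    axioms: dict mapping an axiom name to the set of predicates occurring in it.
    query_predicates: set of predicate names occurring in the query.
    """
    reachable = set(query_predicates)
    changed = True
    while changed:  # grow the predicate set until no axiom adds anything new
        changed = False
        for preds in axioms.values():
            if preds & reachable and not preds <= reachable:
                reachable |= preds
                changed = True
    return sorted(name for name, preds in axioms.items() if preds & reachable)


EXAMPLE_AXIOMS = {
    "A1": {"P6", "P8"},
    "A2": {"P1", "P2", "P3"},
    "A3": {"P3", "P4"},
    "A4": {"P1", "P2"},
    "A5": {"P6", "P7"},
    "A6": {"P4", "P5"},
}

# A query mentioning only Cluster I predicates never needs A1 or A5.
print(relevant_axioms(EXAMPLE_AXIOMS, {"P1", "P3"}))  # prints ['A2', 'A3', 'A4', 'A6']
```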

We note that in the above definitions we have used only the
general axioms as the basis for forming the semantic graph. We believe
that the facts stored in a QA System's data base will be predominantly
in the form of fully-instantiated unit clauses which do not, in the
above sense, interact with other axioms. Should any of the facts be
represented in clauses of length greater than one, then these too would
be included in the axiom-predicate matrix. Furthermore, in the event
that there are fully-instantiated unit clauses whose predicates do not
occur in the general axioms, explicit entry of these predicates into
the axiom-predicate matrix would be made. Such predicates would give
rise to isolated points in the semantic graph.

At this point we are now ready to define what we mean by cluster 
distance. Two alternative definitions will be presented, the min 
cluster distance of a clause and the max cluster distance. These dis- 
tances are measured from a clause to a set of clauses. The set of 
clauses we measure the distance to will be those derived from the nega- 
tion of a query. The distance is measured relative to the semantic 
clusters developed from the set of general axioms. 

Let A be a set of general axioms, and let $G_P$ be the semantic
graph derived from A as described above. Before defining the cluster
distance of a clause from a set of clauses, we define the usual
graph-theoretic distance measure:

DEFINITION 1. (Graph-Theoretic Distance)

Let $P_i$ and $P_j$ be two nodes (predicates) in $G_P$. Then the
(graph-theoretic) distance between $P_i$ and $P_j$, denoted
$\delta(P_i,P_j)$, is the length of the shortest path in $G_P$ from $P_i$
to $P_j$. If $P_i$ and $P_j$ occur in different clusters of $G_P$, then
$\delta(P_i,P_j)$ is set to $\infty$.

DEFINITION 2. (Max Cluster Distance)

Let C be a clause in which there occur the predicates
$P_1, P_2, \ldots, P_n$, and let Q be a query in which the predicates
$Q_1, Q_2, \ldots, Q_m$ occur. Then the max cluster distance of C from the
query Q, denoted $\nabla_Q(C)$, or simply $\nabla(C)$ when Q is understood,
is given by

$$\nabla_Q(C) = \max_{P_i \in C} \Big\{ \min_{Q_j \in Q} \{ \delta(P_i,Q_j) \} \Big\}.$$

It will be shown below how this distance measure may be used to advantage
by a heuristic search algorithm. For the moment, we content ourselves
with some example computations of the max cluster distance of a clause.

EXAMPLE

Let the negation of the query, which we denote by {~Q}, and the
clause C be as given below:

    {~Q} : $P_1 \lor P_3$
    C    : $P_1 \lor P_5$

Then $\nabla_Q(C)$, relative to the semantic graph of Figure 2, is
calculated below:

$$\begin{aligned}
\nabla_Q(C) &= \max_{P_i \in C} \Big\{ \min_{P_j \in Q} \{ \delta(P_i,P_j) \} \Big\} \\
&= \max \{ \min\{\delta(P_1,P_1), \delta(P_1,P_3)\},\; \min\{\delta(P_5,P_1), \delta(P_5,P_3)\} \} \\
&= \max \{ \min\{0,1\},\; \min\{3,2\} \} \\
&= \max \{0,2\} \\
&= 2.
\end{aligned}$$

Thus, clause C is a max cluster distance of 2 from query Q.
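
A small sketch of Definitions 1 and 2 may help in checking such computations. It is not the MRPPS implementation: the function names are hypothetical, the edges are those implied by the nonzero off-diagonal entries of the predicate-predicate matrix, and the example call assumes the reading in which C contains $P_1$ and $P_5$ and the query contains $P_1$ and $P_3$, which is the reading that matches the numeric values above.

```python
import math
from collections import deque

# Edges of the semantic graph of Figure 2 (predicates written by index).
EDGES = [(1, 2), (1, 3), (2, 3), (3, 4), (4, 5), (6, 7), (6, 8)]


def adjacency(edges):
    """Build an undirected adjacency map from a list of edges."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def graph_distance(adj, source, target):
    """Shortest path length by breadth-first search; infinity across clusters."""
    if source == target:
        return 0
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in adj.get(node, ()):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                if nxt == target:
                    return dist[nxt]
                queue.append(nxt)
    return math.inf


def max_cluster_distance(adj, clause_preds, query_preds):
    """Max over clause predicates of the min distance to any query predicate."""
    return max(min(graph_distance(adj, p, q) for q in query_preds)
               for p in clause_preds)


ADJ = adjacency(EDGES)
print(max_cluster_distance(ADJ, [1, 5], [1, 3]))  # prints 2
```

A clause drawn entirely from the other cluster, such as one over $P_7$ and $P_8$, yields `math.inf` under the same query.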
EXAMPLE

Suppose that Q is as given above and $G_P$ is as given in Figure 2, and
the clause C is $P_7 \lor P_8$. Then we have

$$\begin{aligned}
\nabla_Q(C) &= \max \{ \min\{\delta(P_7,P_1), \delta(P_7,P_3)\},\; \min\{\delta(P_8,P_1), \delta(P_8,P_3)\} \} \\
&= \max \{ \min\{\infty,\infty\},\; \min\{\infty,\infty\} \} \\
&= \max \{\infty,\infty\} \\
&= \infty.
\end{aligned}$$

If clause C happened to be an axiom in the data base, then one could
tell by its max cluster distance from the query that it is irrelevant
in processing the query.

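EXAMPLE

Suppose we have

    {~Q} : $P_1 \lor P_8$
    C    : $P_4 \lor P_6$
    $G_P$  : as above

Then we have

$$\begin{aligned}
\nabla_Q(C) &= \max \{ \min\{\delta(P_4,P_1), \delta(P_4,P_8)\},\; \min\{\delta(P_6,P_1), \delta(P_6,P_8)\} \} \\
&= \max \{ \min\{2,\infty\},\; \min\{\infty,1\} \} \\
&= 2.
\end{aligned}$$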

Thus, we see that if each of the predicates of the clause occurs in a 
cluster with at least one of the query predicates, then the max cluster 
distance of the clause from the query will be finite. 

A distance measure which is analogous to the above is the min
cluster distance:

DEFINITION 3. (Min Cluster Distance)

Given the assumptions of Definition 2, above, the min cluster distance
of C from query Q, denoted $\Delta_Q(C)$, or simply $\Delta(C)$ when Q
is understood, is given by

$$\Delta_Q(C) = \min_{P_i \in C} \Big\{ \min_{Q_j \in Q} \{ \delta(P_i,Q_j) \} \Big\}.$$

Below we will show how the min cluster distance may be used in directing
a search.

EXAMPLE

Suppose we have the unsatisfiable set of clauses
{S, ~T, Q, P, RT, ~P~Q, QR~S}. The semantic graph for these clauses
(based on the three non-unit clauses) is

    P --- Q --- R --- T
           \   /
             S

Assume that the negation of the theorem is P. Then the min cluster
distances relative to P are:

    C              S    ~T   Q    P    RT   ~P~Q   QR~S
    $\Delta(C)$    2    3    1    0    2    0      1
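
The table of min cluster distances can be checked mechanically. The sketch below is illustrative only: function names are hypothetical, clauses are written as strings of predicate letters with signs ignored (signs play no part in the distance), and the edges are those induced by the three non-unit clauses.

```python
import math
from collections import deque

# Semantic graph built from the three non-unit clauses RT, PQ and QRS.
EDGES = [("R", "T"), ("P", "Q"), ("Q", "R"), ("Q", "S"), ("R", "S")]


def distances_from(edges, source):
    """Breadth-first distances from `source` to every reachable node."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in adj.get(node, ()):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


def min_cluster_distance(edges, clause_preds, query_preds):
    """Min over clause predicates of the min distance to any query predicate."""
    best = math.inf
    for q in query_preds:
        dist = distances_from(edges, q)
        for p in clause_preds:
            best = min(best, dist.get(p, math.inf))
    return best


CLAUSES = ["S", "T", "Q", "P", "RT", "PQ", "QRS"]  # predicates only, signs dropped
TABLE = {c: min_cluster_distance(EDGES, list(c), ["P"]) for c in CLAUSES}
print(TABLE)
# prints {'S': 2, 'T': 3, 'Q': 1, 'P': 0, 'RT': 2, 'PQ': 0, 'QRS': 1}
```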

In this example, assume the merit function is simply the composite of
length and min cluster distance and that each A(i,j) corresponds to
A(length, $\Delta$). The algorithm will behave like the $\Sigma^*$
algorithm (Section 3.2) in all respects except for the substitution of
$\Delta$ for level.

1. Fill(0,0) is unsuccessful:   A(0,0) = $\emptyset$

2. Fill(0,1) is unsuccessful:   A(0,1) = $\emptyset$

3. Fill(1,0) is successful:     A(1,0) = {P}
   Recurse(P) is unsuccessful

4. Fill(0,2) is unsuccessful:   A(0,2) = $\emptyset$

5. Fill(1,1) is successful:     A(1,1) = {Q}
   Recurse(Q) is unsuccessful

6. Fill(2,0) is successful:     A(2,0) = {~P~Q}
   Recurse(~P~Q) yields ~Q:     A(1,1) = {Q, ~Q}
   Recurse(~Q) yields □:        A(0,2) = {□}

A summary of the steps in the search follows.

1. P ∈ A(1,0)

2. Q ∈ A(1,1)

3. ~P~Q ∈ A(2,0)

4. ~Q ∈ A(1,1)   (from 1,3)

5. □ ∈ A(0,2)    (from 2,4)

If the search were performed by the $\Sigma^*$ algorithm using length and
level, the following steps would have occurred:

1. S ∈ A(1,0)

2. ~T ∈ A(1,0)

3. Q ∈ A(1,0)

4. P ∈ A(1,0)

5. RT ∈ A(2,0)

6. R ∈ A(1,1)    (from 2,5)

7. ~P~Q ∈ A(2,0)

8. ~Q ∈ A(1,1)   (from 4,7)

9. □ ∈ A(0,2)    (from 3,8)

While the above example handles just a trivial problem, we can
observe from it how the use of the cluster distance discriminates, among
clauses of equal length, in favor of those which are more "closely
related" to, and in this sense more relevant to, the query. On the other
hand, the $\Sigma^*$ algorithm, using only length and level, cannot
accomplish this sort of discrimination and so it generates all of the unit
clauses before bringing in the two-clauses. Of course, the use of
clustering would also permit these unit clauses to enter if the search
were any deeper. However, if there were any base clauses whose predicates
are not in the semantic cluster with query predicates, such clauses could
be generated by $\Sigma^*$, but not by the cluster distance based algorithm.

Aside from prohibiting base clauses which are unrelated to the
query from entering the search space, the cluster distances appear to
be useful heuristics for still other reasons. In the first place,
they will have the effect of sequencing the axioms which will enter the
search in such a way that when an axiom is generated, it has a good
chance of being used in an inference. Secondly, they have the effect of
ordering the inferences in such a way that the clauses which are more
closely related to the query are used in inferences before more
"distant" clauses. Whether the min cluster distance is better than the max
cluster distance, or vice versa, must be determined by experimentation,
as indeed must the question of whether or not this measure will have
any utility at all for question answering systems.

Cluster analysis may be used in another way. If one finds clusters
of the clauses that result from the axioms, then these clauses should be
located near one another. This is particularly true when the data base is
stored on peripheral devices. Then, when one clause is brought into core,
a clause in the same cluster might also be brought into core.
4.4 The Base Clause Selection Strategy of MRPPS
4.4.1 General Discussion

It was noted in Section 3.1 that search algorithms such as the A*
or $\Sigma^*$ algorithms require that the merit of all clauses in the
starting set S be established prior to the start of the search so that the
clauses may be ordered by their merit. This requirement has at least two
disadvantages. In the first place, if the merit of a clause is to depend
upon the query, as is the case with the cluster heuristic, then the merit
of clauses in the data base must be calculated, and the data base
reordered, at the start of every search. For any "reasonably" sized data
base, this situation would clearly be intolerable. Furthermore, and of even
greater significance, the static assignment of merit to the base clauses
at the start of the search ignores relevance clues which may become
evident during the course of the search or which may be provided
initially. In fact, as will be shown, it is often possible to glean from a
given search step which of the available base clauses would be "best", in
some sense, to generate (add to an A-set) next. As a trivial example of
this, suppose that the search strategy has somehow generated the unit
clause ~F(Jack, Sally), and is now ready to bring in an axiom from the
data base. Suppose that the merit function is composed of clause length
and clause level. Then, with respect to this merit function, the clauses
M(Rita, Mike) and F(Jack, Sally) would be indistinguishable, as would be
the large bulk of unit clauses in the system's data base. Thus, the clause
brought in next would be a matter of how the clauses happened to be
ordered, even though the clause F(Jack, Sally), if brought in, would lead
to an immediate refutation. Similarly, while the use of the clustering
heuristic might restrict consideration to only the unit "F" clauses, the
obvious best clause would still be indistinguishable from a possibly large
list of other such clauses.

The above observations have led to the development of the Q*
search algorithm, which embodies two search strategies. One strategy,
called the deduction strategy (or the deduction algorithm), determines
which inferences to attempt next and forms deductions by resolution,
factoring and paramodulation. This is essentially the $\Sigma^*$ algorithm
(Section 3.2) with several modifications (to be described in Section 4.5).
The second strategy is the base clause selection strategy (or algorithm),
which determines the next base clause to be generated. It is this latter
strategy that is the subject of the present section.

The base clause strategy is in part based upon the premise that no
axiom should be generated unless, at a minimum, it possesses a literal
which will unify with a literal of a clause already generated. At that
time, a base clause may become a "candidate for generation." For certain
inference systems stricter constraints may be imposed. For example, if
the inference system is, or includes, set-of-support, then only axioms
which could interact logically with a supported clause will be considered
for generation. In this case an axiom will be considered for generation
if it possesses a literal which is opposite in sign to and unifies with a
literal of a supported clause which has already been generated. Similar
restrictions for other inference systems are possible. However, the
restrictions imposed are such that the refutation completeness of the Q*
algorithm is preserved.
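
The candidacy test just described can be sketched as follows. This is an assumption-laden illustration, not the MRPPS code: terms are Python tuples `(symbol, arg, ...)`, variables are strings beginning with `?`, a literal is a pair `(is_positive, atom)`, and variables in the spec literal and the axioms are assumed to be already standardized apart. The name `candidates` and its flag are hypothetical.

```python
def is_var(term):
    """Variables are strings starting with '?' (a convention of this sketch)."""
    return isinstance(term, str) and term.startswith("?")


def walk(term, subst):
    """Follow variable bindings until a non-variable or unbound variable is reached."""
    while is_var(term) and term in subst:
        term = subst[term]
    return term


def occurs(var, term, subst):
    """Occurs check: does `var` appear inside `term` under `subst`?"""
    term = walk(term, subst)
    if term == var:
        return True
    if isinstance(term, tuple):
        return any(occurs(var, t, subst) for t in term[1:])
    return False


def unify(a, b, subst):
    """Return a substitution extending `subst` that unifies a and b, or None."""
    a, b = walk(a, subst), walk(b, subst)
    if a == b:
        return subst
    if is_var(a):
        return None if occurs(a, b, subst) else {**subst, a: b}
    if is_var(b):
        return None if occurs(b, a, subst) else {**subst, b: a}
    if isinstance(a, tuple) and isinstance(b, tuple):
        if a[0] != b[0] or len(a) != len(b):
            return None
        for x, y in zip(a[1:], b[1:]):
            subst = unify(x, y, subst)
            if subst is None:
                return None
        return subst
    return None


def candidates(spec_literal, axioms, set_of_support=True):
    """Axioms having a literal that unifies with the spec literal.

    With set_of_support=True the literal must also be opposite in sign;
    otherwise signs are ignored.
    """
    sign, atom = spec_literal
    found = []
    for clause in axioms:
        for lit_sign, lit_atom in clause:
            if set_of_support and lit_sign == sign:
                continue
            if unify(atom, lit_atom, {}) is not None:
                found.append(clause)
                break
    return found


DATA = [
    [(True, ("F", "Jack", "Sally"))],
    [(True, ("M", "Rita", "Sally"))],
    [(True, ("F", "Bob", "Ann"))],
    [(False, ("F", "?u", "?v")), (False, ("M", "?w", "?v")), (True, ("H", "?u", "?w"))],
]
SPEC = (False, ("F", "?x", "Sally"))  # the generated literal ~F(x, Sally)
print(candidates(SPEC, DATA))  # prints [[(True, ('F', 'Jack', 'Sally'))]]
```

With `set_of_support=False` the last axiom also qualifies, since its literal ~F(u,v) unifies with the spec literal once signs are ignored.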

From the above comments it should be evident that the base clauses
are not initially assigned a merit. Rather, each base clause is initially
assumed to have an undefined merit, which is only to become defined when
the clause becomes a candidate for generation. Furthermore, not all
candidates will have their merit calculated explicitly when they become
candidates. In general, their merit will be defined implicitly by
virtue of their membership in a list of clauses of equal merit.

For example, suppose a clause has been generated which contains the
literal ~F(x,y). Then the list of data clauses $F(a_1,b_1)$, $F(a_2,b_2)$,
..., $F(a_n,b_n)$ would all become candidates for generation. But all of
these clauses have the same merit with respect to length, level,
complexity, and cluster distances, and so we need only assign a merit
value to the list which applies to all of its members. By this use of
dynamic merit calculation for base clauses and implicit clause merits, we
are able to avoid a great deal of computation.

The use of these lists of base clauses has another striking advantage
which is central to the base clause strategy. Suppose we have one
list of candidate base clauses which have become candidates because of
the generated literal ~F(x,y), and another list of candidates because
of the generated literal ~F(x, Sally). For example, one might have:
~F(x, Sally) points at the list F(Jack, Sally), and ~F(x,y) points
at the list $F(a_1,b_1)$, $F(a_2,b_2)$, ..., $F(a_n,b_n)$. Then, while all
of these data axioms have equal merit with respect to the merit function
used by the deduction strategy, the base clause strategy will rank
F(Jack, Sally) above the other clauses. This is done by virtue of
the presence of the constant "Sally" in the generated literal. Thus,
if we are asked a question about an individual, we wish to explore
avenues which are relevant to that individual before we explore other
avenues. If, by pursuing this policy, we come across another individual,
then that individual, being relevant to the first, will also be used to
direct the search.

From the above discussion we see that the query will have a strong
influence on the search. Its literals will be used to establish which
base clauses should become candidates for generation, and its constants
will be used to order the candidates. Generated clauses will be used
in the same way. Thus, if we have a constant in one literal of the
query, resolving on this literal will often have the effect of forming
an instantiated literal in the resolvent whose occurrence in the
resolvend was uninstantiated; the term used in the instantiation will be
relevant to the query constant. By transferring these clues from parent
clauses to generated clauses by means of instantiations, the generated
clauses can have a profound effect on the search for relevant base
clauses.

In Figure 3 is depicted a refutation which could easily have been
derived using the above ideas. In that example, suppose the query was
"Who is Sally's mother's mother-in-law?" As a wff, this query might
have been phrased:

    (∃x)(∃y)(ML(x,y) ∧ M(y, Sally)) .

In clause form, the negation of the query would be

    ~ML(x,y) ∨ ~M(y, Sally) .

Suppose we had among the general axioms the two axioms

    (a) ~M(u,v) ∨ ~H(v,w) ∨ ML(u,w) , and

    (b) ~F(x,y) ∨ ~M(z,y) ∨ H(x,z) .

(a) states that if "u" is the mother of "v" and if "v" is the husband
of "w", then "u" is the mother-in-law of "w", while (b) states
that if "x" is the father of "y" and "z" is the mother of "y",
then "x" is the husband of "z". Suppose also that there are a
large number of data axioms describing who is the mother and who is the
father of particular individuals, among which were the three needed for
the refutation: M(Rita, Sally), M(Rose, Jack), and F(Jack, Sally).

    Clause in the refutation             Resolved against
    ---------------------------------    ------------------------------
    ~ML(x,y) ∨ ~M(y, Sally)              M(Rita, Sally)
    ~ML(x, Rita)                         ~M(u,v) ∨ ~H(v,w) ∨ ML(u,w)
    ~M(u,v) ∨ ~H(v, Rita)                ~F(x,y) ∨ ~M(z,y) ∨ H(x,z)
    ~M(u,v) ∨ ~F(v,y) ∨ ~M(Rita, y)      M(Rita, Sally)
    ~M(u,v) ∨ ~F(v, Sally)               F(Jack, Sally)
    ~M(u, Jack)                          M(Rose, Jack)
    □

FIGURE 3. REFUTATION BASED ON INSTANTIATED LITERALS

Under these circumstances the base clause strategy will assure that the
necessary base clauses are available when they are needed. Note that,
in violation of the $\Sigma^*$ algorithm, the two axioms of length three
will be generated in this case before other axioms of length one, since
the base clause strategy deemed them to be most relevant at that stage of
the search.

As a final note, we observe an important characteristic of the Q*
algorithm derived from the base clause strategy. In the case that there
is an immediate answer to a question, as in the case of a direct look-up
type question one encounters with conventional data management systems,
then the Q* algorithm will find it immediately. For example,
suppose one asks the question, "Who is the father of Sally?", which
corresponds to the wff (∃x) F(x, Sally). The negation of this query,
~F(x, Sally), will be entered into an A-set and the very first clause
provided by the base clause strategy will be F(Jack, Sally). Thus,
in one step the search will be concluded successfully, making this
approach competitive with generalized data management systems which are
basically table look-up methods. This property is noted by Coles [1969]
to be an important one for QA systems, i.e., answering simple questions
quickly and efficiently. In this connection, the actual answer to this,
or any other question, will be available in the system by means of the
Luckham-Nilsson answer extraction algorithm [1970].

4.4.2 The Base Clause Selection Algorithm 

The base clause selection algorithm attempts to provide directionality
to the overall search for a refutation. This is attempted by
using the literals and constants occurring in the query and in
subsequently generated clauses as clues to the relevance of axioms in the
data base. Subject to certain constraints to be described below, each
literal of a generated clause gives rise to an entry on a list called
the SPECLIST (for "specification list"). The list is so named since
each literal entered onto the list is regarded as a specification for
that set of axioms which, in the least restrictive case, contain a
literal which is, or whose negation is, unifiable with the "spec literal".
Each entry on the SPECLIST points at a list of one or more axioms which
have been found to satisfy the specification. The axioms pointed to by
the entries on the SPECLIST are candidates for generation. The SPECLIST
is ordered so that "better" candidates come first; only candidates are
available for generation.

As previously stated, that part of the Q* search algorithm which
is concerned with generating clauses (i.e., adding to A-sets those
clauses obtained from the base clause algorithm or from logically
interacting clauses already generated) will be called the deduction
algorithm. The deduction algorithm communicates with the base clause
selection algorithm in two distinct ways; one way is direct, the other is
indirect. Each time a clause is generated, that clause together with its
merit (actually, a pointer to the clause and a pointer to its merit) are
placed on a list called USPECS (for "unprocessed specs"); this is the
indirect means of communication. Each time the deduction algorithm
finds it appropriate to try to generate a base clause of merit better
than or equal to M (or sometimes strictly better than M), the base
clause routine BASEC (for "base clause") is called to attempt to satisfy
the request; this is the direct means of communication. When a request
is made for a base clause satisfying a given merit condition, all of
the entries on USPECS, if any, are processed to produce zero or more
entries for the SPECLIST. Once the SPECLIST has been updated, BASEC
determines whether or not it can return the "best" clause. If it can,
it does so; otherwise, it exits with failure. In the following two
sections we describe the major functions performed by the base clause
selection algorithm, namely, creating SPECLIST entries and providing
clauses to the deduction algorithm.

Creation of SPECLIST Entries 

Each literal of the clauses found on USPECS, that is, of the
clauses entered into A-sets by the deduction algorithm, is used as a
basis for creating an entry on the SPECLIST. Each SPECLIST entry
contains the following information:

a. The address of the literal upon which the SPECLIST entry is
   based; this literal will be called the spec literal.

b. The address of the clause containing the spec literal; this
   will be called the spec clause.

c. The address of the merit vector for the spec clause; this merit
   will be called the spec clause merit.

d. The address of the first axiom of a list of one or more axioms
   which are said to satisfy the specification; this first axiom
   will be called the spec axiom. The merit of the spec axiom
   will be better than or equal to the merit of all other axioms in
   this list, where this measure of the merit is the same as that
   used by the deduction strategy.

e. The address of the merit vector for the spec axiom; this merit
   will be called the spec axiom merit.

f. The address of the merit vector giving the (predicted) upper
   bound merit of a resolvent of two clauses having merits equal
   to the spec clause merit and the spec axiom merit respectively;
   this merit will be called the spec upper bound merit.

g. A code number, called the spec type, which has the value one if
   the spec literal contains a constant, has the value three if
   the spec literal contains a function symbol, but no constants,
   and has the value five if the spec literal contains neither
   constants nor functions.
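
The seven fields of an entry, and the computation of the spec type, might be sketched as below. This is an illustrative layout only, assuming object references in place of addresses; the class name, field names, the tuple encoding of literals (`(symbol, arg, ...)` with `?`-prefixed variables), and the `as_variables` parameter are all hypothetical.

```python
from dataclasses import dataclass
from typing import Any, List, Tuple


def is_var(term):
    """Variables are strings starting with '?' (a convention of this sketch)."""
    return isinstance(term, str) and term.startswith("?")


def spec_type(atom, as_variables=frozenset()):
    """Return 1 if the literal contains a constant, 3 if it contains a
    function symbol but no constant, and 5 otherwise.

    Constants named in `as_variables` are treated as variables.
    """
    has_constant = has_function = False
    stack = list(atom[1:])          # arguments of the predicate
    while stack:
        term = stack.pop()
        if isinstance(term, tuple):  # a function application f(...)
            has_function = True
            stack.extend(term[1:])
        elif not is_var(term) and term not in as_variables:
            has_constant = True
    if has_constant:
        return 1
    return 3 if has_function else 5


@dataclass
class SpecEntry:
    spec_literal: Any                 # field a
    spec_clause: Any                  # field b
    spec_clause_merit: Tuple          # field c
    axioms: List[Any]                 # field d: axioms[0] is the spec axiom
    spec_axiom_merit: Tuple           # field e
    spec_upper_bound_merit: Tuple     # field f
    spec_type: int                    # field g


literal = ("F", "?x", "Sally")
entry = SpecEntry(
    spec_literal=literal,
    spec_clause=[(False, literal)],
    spec_clause_merit=(1, 0),
    axioms=[[(True, ("F", "Jack", "Sally"))]],
    spec_axiom_merit=(1, 0),
    spec_upper_bound_merit=(0, 1),    # placeholder value, not a computed bound
    spec_type=spec_type(literal),
)
print(entry.spec_type)  # prints 1
```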

Each clause on USPECS is processed in turn, and is removed from
USPECS when it has been completely processed. The exact nature of the
processing will depend upon the inference system(s) in force. If
set-of-support is in force either by itself, or in combination with other
inference systems, or imbedded within an inference system as it is with
SL-resolution, then any clause found on USPECS which does not have
support will be removed from USPECS and not processed at all. If the
clause does have support, then those of its literals which may validly
be used in an inference (and which are not alphabetic variants of spec
literals already contained in the SPECLIST) will be matched against the
axioms. Axioms containing a literal which is opposite in sign to and
unifiable with the (potential) spec literal will become the axioms
pointed to by the SPECLIST entry. An axiom with the best merit will
become the spec axiom. Note that with an inference system such as
SL-resolution, only the selected literal will become the spec literal.
Note also that with a suitably chosen data structure, the amount of
searching for axioms containing literals which are opposite in sign to
and unifiable with a given literal can be kept to a minimum.

For other than set-of-support based systems, unsupported clauses
(i.e., axioms) can also be used to locate axioms. Furthermore, we must
now be somewhat less selective in considering axioms and ignore the
signs of the unifying literals. Thus, in non-set-of-support cases an
axiom will become a candidate for generation if it contains a literal
which unifies with a literal of a generated clause when the signs of
the literals are ignored.

The process, then, for creating an entry for the SPECLIST involves
finding the list of candidate axioms for the spec literal, assuring
that one of best merit occurs at the front of the list, and filling in
the remaining spec entries. The spec clause merit is obtained from
USPECS, while the spec axiom merit and the spec upper bound merit must
be calculated. The spec type must also be determined by scanning the
spec literal for constants and functions. In this regard, we have the
capability which allows a user to specify that certain constants are to
be treated as variables. Thus, certain non-specific, class-specifying
types of constants which occur in a large subset of the data axioms can
be prevented from getting the same privileged status given to more
meaningful constants. In a data base about people, for example, such
constants as "Male" or "Female" might be arguments in literals
describing individuals.

Once the spec is completed it must be placed on the SPECLIST. As
we noted earlier, the SPECLIST will be ordered so that the "best" spec
axiom will be the next one delivered to the deduction algorithm. The
algorithm orders specs by spec type. A spec of type 1 precedes one of
type 3, etc. If a constant occurs in the query, this ordering has the
effect of focusing the search on data axioms in which the query
constants occur, and on general axioms which may interact with the
constant-carrying literals of the query and those of its successors which
receive constants by instantiation as inferences are made. Among a group
of specs of the same type we are experimenting with various
sub-orderings. As a basis for the sub-orderings, we have
either tried or may try the following properties:

a. spec upper bound merit

b. spec clause merit

c. spec axiom merit

d. where the spec literal has a constant, order by how "far away"
   that constant is from the query constant (informally, e.g., if
   a resolution on a literal ~P(a,x) of a clause ~P(a,x) ∨ Q(x,y)
   causes the literal Q(b,y) to occur in the resolvent, because
   the other literal taking part in the resolution was P(a,b),
   then we might say that b is a distance of one away from a,
   and so on)

e. the number of potential spec literals which were found to be
   alphabetic variants of the given spec literal

f. the cluster distance (max or min) of the spec literal to the
   query predicates

g. the degree extracted from the semantic graph that gives
   the number of predicates that co-occur with the predicate of
   the spec literal in the axioms

h. the ratio of constant arguments to non-constant arguments.

The "goodness" of the specs might also be defined as a linear
combination of some of the above properties.
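
One way to realize this two-level ordering is a sort key whose first component is the spec type and whose second is a weighted sum of the chosen properties. The sketch below is hypothetical throughout: the property names, the weights, and the convention that a smaller score is better are assumptions made for illustration, not values taken from MRPPS.

```python
def speclist_key(spec, weights):
    """Sort key: spec type first, then a linear combination of properties.

    Smaller keys sort first; properties without a weight contribute nothing.
    """
    score = sum(weights.get(name, 0.0) * value
                for name, value in spec["properties"].items())
    return (spec["type"], score)


SPECS = [
    {"literal": "~F(x,y)", "type": 5,
     "properties": {"cluster_distance": 0, "constant_distance": 0}},
    {"literal": "~M(u,Jack)", "type": 1,
     "properties": {"cluster_distance": 1, "constant_distance": 1}},
    {"literal": "~F(x,Sally)", "type": 1,
     "properties": {"cluster_distance": 0, "constant_distance": 0}},
]
WEIGHTS = {"cluster_distance": 1.0, "constant_distance": 2.0}  # illustrative only

ORDERED = sorted(SPECS, key=lambda s: speclist_key(s, WEIGHTS))
print([s["literal"] for s in ORDERED])
# prints ['~F(x,Sally)', '~M(u,Jack)', '~F(x,y)']
```

Because the spec type is the leading component of the key, no choice of weights can move a type 3 or type 5 spec ahead of a type 1 spec.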

Notice that, in general, the spec axioms will not be in merit order
(as used in the deduction strategy). This is a very significant
departure from the $\Sigma^*$ algorithm, and it is done at the expense of
losing admissibility. But this sacrifice is only of theoretical
importance, since it would seem to be more important in practical
applications to find any solution quickly rather than to carry out an
exhaustive search and either run out of space or time and find no solution
at all, or find a simplest solution at great expense.

Handling Requests for Axioms

There are two modes of operation for the base clause request
processor BASEC; these are the "constants on" mode and the "constants off"
mode. The former mode is used whenever the query contains one or more
constants, while the latter mode is used otherwise. We will describe
the latter mode of operation first.

In the constants off mode, BASEC may receive a request for an axiom
of merit better than or equal to M. If the merit of any spec axiom
satisfies this condition, the axiom is returned to the requesting program;
otherwise, failure is reported.

Any time an axiom is removed from a SPECLIST entry, the axiom must 
be tagged as in use, since it may be pointed at by other entries (there 
is never more than one copy of any axiom) , and the SPECLIST entry must 
be updated. If the removed axiom is the only one in the list of axioms 
corresponding to the entry, the entry is removed from the SPECLIST. 
Otherwise, an axiom in the list of best merit becomes the new spec 
axiom, the spec axiom merit is revised, a new spec upper bound merit is 
computed, and the revised SPECLIST entry is reinserted into an appro- 
priate position of the SPECLIST. 

Before a spec axiom is returned, its in-use status is checked to
determine if it has already been generated. If it has, the axiom is
removed and the SPECLIST is updated as described in the previous
paragraph. This process will be repeated until the spec axiom of the first
SPECLIST entry is not in use, or until the SPECLIST is empty.
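
The constants off behaviour, together with the in-use bookkeeping, can be sketched as follows. This is a simplified illustration and not the BASEC routine itself: merits are assumed to be tuples compared so that a smaller tuple is better, stale axioms are purged from every entry up front rather than only from the first, and an updated entry is left in place instead of being reinserted at a new position. All names are hypothetical.

```python
def basec_constants_off(speclist, in_use, merit_of, bound):
    """Return an axiom of merit better than or equal to `bound`, or None.

    speclist: list of entries, each a dict with an "axioms" list.
    in_use:   set of axioms already generated (shared by all entries).
    merit_of: function mapping an axiom to its merit (smaller is better).
    """
    # Drop axioms already generated through some other entry, and drop
    # entries whose axiom list has become empty.
    for entry in list(speclist):
        entry["axioms"] = [a for a in entry["axioms"] if a not in in_use]
        if not entry["axioms"]:
            speclist.remove(entry)

    for entry in speclist:
        entry["axioms"].sort(key=merit_of)   # best merit first: the spec axiom
        axiom = entry["axioms"][0]
        if merit_of(axiom) <= bound:
            entry["axioms"].pop(0)
            in_use.add(axiom)                # tag as in use; only one copy exists
            if not entry["axioms"]:
                speclist.remove(entry)       # safe: we return immediately
            return axiom
    return None                              # failure is reported


MERITS = {"F(Jack,Sally)": (1, 0), "F(Bob,Ann)": (1, 0), "axiom_b": (3, 0)}
SPECLIST = [
    {"spec": "~F(x,Sally)", "axioms": ["F(Jack,Sally)"]},
    {"spec": "~F(x,y)", "axioms": ["F(Jack,Sally)", "F(Bob,Ann)", "axiom_b"]},
]
IN_USE = set()

first = basec_constants_off(SPECLIST, IN_USE, MERITS.get, (1, 0))
second = basec_constants_off(SPECLIST, IN_USE, MERITS.get, (1, 0))
third = basec_constants_off(SPECLIST, IN_USE, MERITS.get, (1, 0))
print(first, second, third)  # prints F(Jack,Sally) F(Bob,Ann) None
```

The second call skips F(Jack,Sally) in the remaining entry because it is already tagged as in use, which is the situation the in-use check is meant to handle.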

In the constants on mode, BASEC can be called either to request an
axiom of merit better than some merit value M, or to request an axiom
whose merit is equal to M. In the former case, if the spec type of
the first SPECLIST entry is greater than one, or if the spec axiom merit
of this entry is worse than or equal to M, BASEC returns to the
requesting program and reports failure. On the other hand, if the first
spec axiom merit is better than M, and the spec type is equal to one,
the address of the axiom, together with the address of its merit vector,
are returned to the requesting program.

If the request is for an axiom of merit equal to M, the following
processing occurs. If the first SPECLIST entry has spec type equal to
one, and if the spec axiom merit is equal to M, the address of the
spec axiom is returned together with the address of its merit vector.
If the first SPECLIST entry is of type one, and the spec axiom merit is
worse than M and M is worse than a predicted merit, PM (which
will be described below), then a type three or type five spec axiom is
sought whose merit is better than or equal to M. (Only the first type
three and the first type five entries need to be checked.) If such an
axiom is found, it, together with its merit, are returned to the
requesting program. In any other case, failure is reported to the
requesting program.

In the constants on mode only, whenever a type one axiom is returned,
its corresponding spec upper bound merit (i.e., the upper bound of
the merit of a resolvent of two clauses whose respective merits are
those of the spec axiom and the corresponding spec clause) will be
saved in the merit vector PM if the current PM value is better
($<_u$) than the upper bound merit. (PM is initialized with the zero
vector when the system is presented with a query.) On subsequent calls
to BASEC, PM is used to keep other axioms from being returned until
the requested merit value, M, is worse than PM. This guarantees that
the generated type one axiom which gave rise to the PM value has had a
chance to interact logically with the corresponding spec clause. The
result of such interaction may be (hopefully, it will be) the
transference of one or more constants of the spec literal and/or spec
axiom into literals of the resolvent. Such literals would then be used to
cause additional axioms to be considered for generation.

4.5 The Deduction Strategy of MRPPS 
4.5.1 Introduction 

The deduction strategy used in MRPPS is closely modeled after the
$\Sigma^*$ algorithm of Kowalski [1970b]. The evaluation function f is
defined by the user at the time of a proof. In particular, the user may
specify m components for the feature vector F, as well as a linear
transformation matrix W (which currently is an identity transform by
default). If the transformed feature vector is

    $\hat{F} = (\hat{f}_1, \hat{f}_2, \ldots, \hat{f}_m) = W \cdot F$ ,

then the merit for a clause generated during a proof is
$M = (\hat{f}_1, \hat{f}_2, \ldots, \hat{f}_m)$. The merit orderings
$\le_u$ used are defined by Equations 4.9 and 4.11 (refer back to
Section 4.2.1).
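
The computation of a merit vector from a clause's features can be
sketched as follows. This is only an illustration under stated
assumptions: the names `Clause`, `FEATURES`, `feature_vector` and
`merit` are hypothetical and do not come from MRPPS, and only the two
simplest features (length and level) are modeled.

```python
from dataclasses import dataclass

# Hypothetical choice of features; their order fixes the components of F.
FEATURES = ("length", "level")


@dataclass(frozen=True)
class Clause:
    literals: tuple  # e.g. ("P(a)", "~Q(x)")
    level: int       # depth of the clause in the deduction tree


def feature_vector(clause):
    """F: one raw feature value per selected parameter."""
    values = {"length": len(clause.literals), "level": clause.level}
    return [values[name] for name in FEATURES]


def merit(clause, W=None):
    """M = W . F, where W defaults to the identity transform."""
    F = feature_vector(clause)
    if W is None:
        return tuple(F)
    return tuple(sum(w * f for w, f in zip(row, F)) for row in W)


c = Clause(literals=("P(a)", "~Q(x)"), level=3)
print(merit(c))                      # identity transform: (2, 3)
print(merit(c, W=[[2, 0], [0, 1]]))  # illustrative non-identity W: (4, 3)
```

With the identity transform the merit is simply the feature vector
itself, which is the only case the current system is said to support.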

At the present time, the following five parameters are available
to the user for selection as part of the evaluation function:

1) clause length;

2) clause level;

3) minimum cluster distance;

4) maximum cluster distance; and,

5) clause complexity.

Various other measures may be incorporated in the future.

The basic idea of the deduction strategy is to generate (or to
bring into an active status) clauses of best merit first, then next best
merit, etc. The methods of generation used are inclusion of a clause
from the set of base clauses, factoring, resolution, and paramodulation.
Only clauses that have been generated (i.e., placed in an A-set) may
interact to generate new clauses. An attempt is made to estimate or
predict the merit of a resolvent or paramodulant from the merits of its
two parents without explicitly forming the resultant clause. We
therefore interact those clauses that we predict will yield the best
merit successors. A prediction is necessary since, in general, we cannot
give an exact formula for the merit of a resolvent or paramodulant but
only an upper bound. If only length and level were used, however, an
exact value would be known. Whenever a clause is generated by any means,
its merit M is calculated (rather than predicted) and it is placed in a
merit set designated by $A(\hat{f}_1, \hat{f}_2, \ldots, \hat{f}_m)$ or
A(M). In the following sections, these ideas will be explained in more
detail and the three major subroutines of the algorithm will be
described along with any differences they have with respect to the
$\Sigma^*$ algorithm.

4.5.2 Subroutine NEXTMERIT

Since clauses are to be generated in increasing upper diagonal
merit order, a routine NEXTMERIT is used to enumerate the merit
components $\hat{f}_i$ in the order that the corresponding A-sets are to
be filled. The sequence of merits generated is of course dependent upon
whether the user has chosen Ordering-A or Ordering-B (Equations 4.9 and
4.11, respectively). In the current version of MRPPS, it is assumed that
all features take on values of 0, 1, 2, 3, ..., n where n is a positive
integer. Also, only a linear transformation of the form
$t: E^m \to E^m$ is allowed, m = 1, ..., 5, and W is the identity
matrix. Thus, the components of $\hat{F}$ are actually equal to the
corresponding components of F.

NEXTMERIT is called whenever a new A-set is to be filled by the
FILL subroutine, at which time it generates the next merit vector
$\hat{F}$ with respect to the ordering $\le_u$ chosen. FILL then
attempts to form clauses of merit $M =_u \hat{F}$. As an example of
possible sequences generated during a proof, consider the enumerations
given in Figure 4. Assume for Ordering-B that $\hat{f}_1$ and
$\hat{f}_2$ are "h-type" parameters and that $\hat{f}_3$ is a "g-type"
parameter.
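
The role of NEXTMERIT can be illustrated with a small enumerator. The
sketch below is not the MRPPS routine: Equations 4.9 and 4.11 are not
reproduced in this section, so the ordering used here (by component sum,
ties broken lexicographically) is an assumed stand-in for a diagonal
ordering, and the function names are hypothetical.

```python
from itertools import product


def enumerate_merits(m, n):
    """Yield every m-component vector with entries in 0..n.

    Assumed stand-in ordering: increasing component sum, then
    lexicographic order. The report's Ordering-A and Ordering-B
    may differ from this.
    """
    vectors = product(range(n + 1), repeat=m)
    yield from sorted(vectors, key=lambda v: (sum(v), v))


def next_merit(current, m, n):
    """Return the vector that follows `current`, or None at the end."""
    sequence = list(enumerate_merits(m, n))
    i = sequence.index(tuple(current))
    return sequence[i + 1] if i + 1 < len(sequence) else None


print(list(enumerate_merits(2, 1)))  # [(0, 0), (0, 1), (1, 0), (1, 1)]
print(next_merit((1, 0), 2, 2))      # (0, 2)
```

Whatever ordering is actually chosen, the essential property is that
each merit vector is produced exactly once, so that each A-set is
filled exactly once.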

4.5.3 Subroutine FILL(M) 

At the start of a proof, there exist no A-sets. When the query is
input by the user on the teletype, the merit of each of the n clauses
in the negation of the query is calculated, header cells for the
corresponding A-sets are created, and the clauses are placed in the
correct sets. If

    $M_0 = \min_{\le_u} \{ M_i \mid M_i$ is the merit of $C_i \in \sim Q,\ i = 1, \ldots, n \}$

then a call to FILL($M_0$) is made.

Each time FILL(M) is called, a check is made to see if any
clauses are already in the A-set being filled. For instance, clauses
from the negation of the theorem are found in this manner. If a clause
is found, pointers to it and its merit are placed on the list USPECS
(refer back to Section 4.4.2) and the clause is passed to RECURSE. If
no clause is found, FILL attempts to find a base clause of merit
$M' \le_u M$ by calling subroutine BASEC. As described earlier, BASEC
processes the entries on USPECS, creates entries on SPECLIST and uses
the literals on the SPECLIST to determine what clause to select from
the data base. If BASEC returns a clause C of merit CM, pointers
to C and CM are placed on USPECS and RECURSE(C) is called.

FIGURE 4. ENUMERATIONS OF MERIT VECTORS FOR ORDERING-A AND
ORDERING-B AND $t: E^3 \to E^3$

If BASEC fails to find any suitable clause, resolution is attempted
between clauses already generated. FILL tries to predict the merit of 
resolvents by performing calculations upon the merit components of two 
prospective parents. They are prospective since we need not look at in- 
dividual clauses in making these predictions but only at the merits of 
the sets containing the parents. We can thus often eliminate entire 
sets of clauses from consideration as parents without forming any resol- 
vents at all. 

As noted previously, for many types of components it is impossible
to predict exactly the merit for successors. Instead, we can find
either lower bounds or upper bounds for the resulting merit. In
attempting to FILL A(M), it would be advantageous to be able to
guarantee that no successor clause would unexpectedly yield merit
$>_u M$, for if this did happen, we would have to save it temporarily on
a list and for the time being ignore it, since we want to generate all
those clauses of merit $\le_u M$ before clauses with worse merit than M.
This can be accomplished by delaying the interaction of two clauses
$C_1 \in A(M_1)$ and $C_2 \in A(M_2)$ until the upper bound for the
merit of their successors equals M, the merit of the set being filled.
That is, after explicit calculation of a resolvent, we may discover its
merit to be better than or equal to M, but never worse.

Note that this is a departure from the technique used in the
$\Sigma^*$ algorithm, since when only length and level are used, the
merits of resolvents are known exactly. Another difference is that in
the Q* algorithm, $M_1$ and/or $M_2$ may equal M rather than being
strictly better than M as is the case in $\Sigma^*$. As an example,
consider an evaluation function whose sole component is clause length.
The upper bound on the merit of a resolvent of $C_1$ of length
$M_1 = \ell_1$ and $C_2$ of length $M_2 = \ell_2$ is
$\ell_1 + \ell_2 - 2$. If M = 2, then a possible combination of
merits is $M_1 = M_2 = M = 2$. Thus, if no base clause is found with
merit M, resolution is attempted between clauses previously generated
and whose merit is better than or equal to M.
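
The length-only example can be made concrete with a short sketch. It
assumes, as in the example above, that merit is a single integer (the
clause length) and that the upper bound for a resolvent is the sum of
the parent lengths minus 2. The function name `parent_merit_pairs` is
hypothetical.

```python
def parent_merit_pairs(filled_merits, M):
    """Return pairs (M1, M2), with M1 <= M2, of existing A-set merits
    whose length upper bound M1 + M2 - 2 equals the merit M of the
    A-set currently being filled."""
    merits = sorted(filled_merits)
    return [
        (m1, m2)
        for i, m1 in enumerate(merits)
        for m2 in merits[i:]
        if m1 + m2 - 2 == M
    ]


# With A-sets of merit 1, 2 and 3 already present, filling A(2) may
# interact A(1) with A(3), or A(2) with itself.
print(parent_merit_pairs({1, 2, 3}, 2))  # [(1, 3), (2, 2)]
```

Note that only the merits of the sets are inspected, not the clauses
in them, which is what allows whole sets to be ruled out as parents.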

When no more clauses can be resolved, paramodulation may be
attempted if the user selected paramodulation when the search was initiated.
This process is very similar to that just described for resolution. In 
either case, if FILL forms a clause C , its merit CM is calculated, 
pointers to C and CM are placed on USPECS as described before, and 
RECURSE(C) is called. 

As implied above, it is necessary to know the upper bounds
between pairs of feature values. The following table gives the upper
bounds between parameter $f_1$ of $C_1$ and $f_2$ of $C_2$ for
resolution and paramodulation as used by both FILL and RECURSE.

    parameter               resolution            paramodulation

    length                  $f_1 + f_2 - 2$       $f_1 + f_2 - 1$
    maximum cluster         $\max(f_1, f_2)$      $\max(f_1, f_2)$
    minimum cluster         $\max \delta$         $\max \delta$
    functional complexity   $f_1 + f_2$           $f_1 + f_2$
    level                   $\max(f_1, f_2) + 1$  $\max(f_1, f_2) + 1$

Here $\max \delta$ is the maximum finite graph theoretical distance in
the semantic graph for the data base.

The complete vector for an upper bound between clauses $C_1$ and
$C_2$ is simply the vector whose components $u_i$ are defined as
$u_i = \text{upper bound}(f_{1i}, f_{2i})$ for feature $f_{1i}$ of
$C_1$ and $f_{2i}$ of $C_2$.
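
The table translates directly into a componentwise computation. The
following is a minimal sketch, assuming feature vectors are tuples in a
known feature order; the dictionary layout, the feature keys, and the
value chosen for `MAX_DELTA` are illustrative assumptions rather than
details of the MRPPS implementation.

```python
# Hypothetical value of max delta (the largest finite distance in the
# semantic graph); it depends on the data base in use.
MAX_DELTA = 10

RESOLUTION_BOUNDS = {
    "length": lambda f1, f2: f1 + f2 - 2,
    "max_cluster": lambda f1, f2: max(f1, f2),
    "min_cluster": lambda f1, f2: MAX_DELTA,
    "complexity": lambda f1, f2: f1 + f2,
    "level": lambda f1, f2: max(f1, f2) + 1,
}

# Paramodulation differs from resolution only in the length bound.
PARAMODULATION_BOUNDS = dict(
    RESOLUTION_BOUNDS, length=lambda f1, f2: f1 + f2 - 1
)

BOUNDS = {
    "resolution": RESOLUTION_BOUNDS,
    "paramodulation": PARAMODULATION_BOUNDS,
}


def upper_bound(features, m1, m2, rule="resolution"):
    """Componentwise upper bound u_i for the merit of a successor of
    two clauses with merit vectors m1 and m2."""
    table = BOUNDS[rule]
    return tuple(table[name](a, b) for name, a, b in zip(features, m1, m2))


features = ("length", "level")
print(upper_bound(features, (2, 1), (3, 4)))                    # (3, 5)
print(upper_bound(features, (2, 1), (3, 4), "paramodulation"))  # (4, 5)
```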
4.5.4 Subroutine RECURSE(C)

During the following discussion of RECURSE(C) assume that M is
the merit of the A-set currently being filled, C is the clause being
recursed upon, and CM is its merit. RECURSE attempts to either (1)
select appropriate clauses from the data base or to generate all
possible successors of clause C by (2) factoring C, by (3) resolving
C with a previously generated clause, or by (4) paramodulating C with
another clause. In general, any clause generated by (1) must have merit
$N <_u CM$ whereas those generated by (2), (3) or (4) must have merit
$N \le_u M$. Thus, RECURSE proceeds in four stages as described below.

The first stage determines whether there are data base clauses of
merit $N <_u CM$. BASEC is called and the literals on SPECLIST are
referred to for guidance in selecting the "best" clause to enter into an
A-set. This operation is done during RECURSE (rather than only in
FILL as is the case in $\Sigma^*$) in order to interact base clauses as
soon as possible after the literals permitting such an interaction are
placed on SPECLIST.

If a clause D is found by BASEC, the merit DM of D is computed,
D is placed in an A-set, pointers to C and CM are saved on a
stack, pointers to D and DM are placed on USPECS, and RECURSE is
called on D. We have therefore decided to delay recursing on C
temporarily in favor of D since this seems more promising at the
moment.

Note that this operation is not permitted in the $\Sigma^*$ algorithm,
which requires that base clauses be entered into an A-set
during FILL but not during RECURSE. However, between the time a clause
C is generated in FILL, is recursed upon, and control returns to FILL
again, many literals may have been placed on SPECLIST because of the
inferences generated during this time. Thus in $\Sigma^*$, these
literals are ignored until FILL is re-entered, and we have therefore
unnecessarily delayed the interaction of data base clauses with clauses
already generated. We do not know how much efficiency will be gained by
calling BASEC during RECURSE, and it is thus available as an option to
the experimenter.

If no more clauses can be generated by Stage 1 of RECURSE, Stage 2
attempts to factor C. Factoring is always performed without any
reference to merits since normally a factor has better merit than its
parent. If, however, a factor D of merit $>_u M$ is formed, RECURSE(D)
is not called but deferred until a future call to FILL. Else, if a
factor D is formed using literals $\ell_1 \in C$ and $\ell_2 \in C$,
the merit DM of D is calculated, D is entered into the correct A-set,
pointers to C, $\ell_1$, $\ell_2$ and CM are stacked, pointers to D and
DM are placed on USPECS, and RECURSE(D) is called. If no more factors
can be formed, Stage 3 is entered.

Stage 3 finds all resolvents of C with clauses C' such that
Upperbound(Merit(Resolvent(C, C'))) $\le_u M$. When such a clause D is
formed using literals $\ell_1 \in C$ and $\ell_2 \in C'$, the merit DM
is calculated, D is placed in the correct A-set, pointers to D and DM
are placed on USPECS, pointers to C, C', $\ell_1$, $\ell_2$ and CM are
stacked, and RECURSE(D) is called.

If the above upper bound is $N >_u M$, the resolvent is not
generated since a future call to FILL(N) will generate the clause.
Recall that RECURSE of $\Sigma^*$ formed resolvents of merit
$N <_u M$. This was because when only length and level are used as
components of the evaluation function f, it is impossible to form a
clause of merit equal to M, whereas in Q* it is quite possible (for
example, consider only length as a parameter).

When Stage 3 fails to find any more resolvents with C,
paramodulation may be attempted in Stage 4, if desired. This process is
similar to that of Stage 3 except that when a paramodulant D is formed
between C and C', pointers to C and C' and the terms from each
clause used in making the substitutions must be stacked before recursing.
If all four stages cannot generate any more clauses, the stack is
popped and control is returned to the previous level of RECURSE, which
continues to find more base clauses or to find more successors to C,
depending on what stage has resumed execution.

Although selecting base clauses during RECURSE may help optimize
the order of clause generation, it also can cause the generation of
duplicate clauses. For instance, let clause C be stacked because
clause C* has been found by BASEC when RECURSE(C) is called. Then
RECURSE(C*) may cause D = Resolvent(C*, C) to be formed. When C is
later unstacked and recursed upon, the same clause D = Resolvent(C, C*)
may again be formed. This duplication can be avoided either by checking
for alphabetic variants after forming inferences or by some bookkeeping
procedure. Since alphabetic variants are normally eliminated, Q* uses
this method.

4.5.5 Halting Conditions for the Deduction Strategy 

It should be noted that under certain conditions, the deduction
strategy can detect that no more inferences can be produced. If no null
clause has been produced up to that point, the search may be halted and
the query may be considered unanswerable or false. This condition may be
detected as follows. At any given point of a proof, WM points to the
worst merit non-empty A-set. The merit of the worst merit resolvent (or
paramodulant) between $C_1 \in A(WM)$ and $C_2 \in A(M)$, where
$M \le_u WM$, is calculated and is pointed to by M*. Any time a clause
of merit $N >_u WM$ is formed, $WM \leftarrow N$ and M* is recalculated.
If no null clause has been found after FILL(M*) is complete, the search
is halted because no more inferences can be formed.

4.6 The Q* Search Algorithm - The Search Strategy of MRPPS 

4.6.1 Introduction 

This description of the Q* algorithm ties together the deduction
strategy and the base clause selection strategy of MRPPS. Details
covering SPECLIST entries were discussed in Section 4.4. In particular,
keep in mind that each time a base clause is selected, the SPECLIST is
reordered, spec literals are removed if necessary, a new spec axiom is
found, and the spec upper bound merit is recalculated. Also, the
following algorithm does not include provisions for paramodulation since
the control mechanism is identical to that for resolution. The algorithm
employs a pointer variable n to point at the current clause. The vectors
M, N, WM, M*, and PM store merit values. The notation PM $\leftarrow$ 0
means that all components of vector PM are set to 0. Merit $M_1$ is
better than or equal to merit $M_2$, denoted $M_1 \le M_2$, according to
the merit orderings described in Section 4.2.1. The notation
n $\leftarrow$ FILTER(n) means that clause n is passed to a "filtering"
routine that eliminates tautologies, subsumed clauses, and alphabetic
variants. BASER = 1 means that BASEC is called during RECURSE. With
these comments in mind, the algorithm will now be given.

4.6.2 The Q* Algorithm 

[1] [Initialize]. Set PM $\leftarrow$ 0, WM $\leftarrow$ 0,
    M* $\leftarrow$ 0, FILLMODE $\leftarrow$ 1, STACK1 $\leftarrow$ Λ,
    STACK2 $\leftarrow$ Λ, and STACK3 $\leftarrow$ Λ.

[2] [Enter Query Clauses]

    [2.1] [Test For Constants In Query]. If a constant appears
          in a clause of {Q}, set CONST $\leftarrow$ 1, otherwise set
          CONST $\leftarrow$ 0.

    [2.2] [Enter Query Clauses Into Merit Sets]. Enter each
          clause of {Q} into an appropriate merit set. If the
          null clause is in {Q}, print a message to this effect
          and stop; otherwise, set M to the merit of the best
          merit clause in {Q}, and set FILLMODE $\leftarrow$ 1.

[3] [Fill Merit Set A(M)]. If FILLMODE = 1 go to [3.1],
    otherwise, if FILLMODE = 2 go to [3.2], otherwise go to
    [3.3.3].

    [3.1] [Fill By Scanning Merit Set]. If there are no clauses
          in A(M) to which RECURSE has not been applied, set
          FILLMODE $\leftarrow$ 2, and go to [3.2]; otherwise, let n
          point at the first such clause, mark the clause as
          having been explored by RECURSE, set N $\leftarrow$ M, and go
          to [5].

    [3.2] [Fill With Base Clause]. If CONST = 1, go to [3.2.1],
          otherwise go to [3.2.2].

        [3.2.1] [Constants Are Turned On]. If a constant
                occurs in the first spec literal and the
                associated spec axiom has merit = M,
                continue; otherwise, go to [3.2.1.1]. Let n
                point at the spec axiom, set N $\leftarrow$ M, remove
                axiom n from the SPECLIST entry and enter it
                into merit set A(N). Calculate the predicted
                merit, $m_p$, of a resolvent of two clauses whose
                merits are equal to the merit of the spec
                clause and the merit of the spec axiom,
                respectively. If PM < $m_p$, set PM $\leftarrow$ $m_p$.
                Go to [5].

            [3.2.1.1] If a constant occurs in the first
                      spec literal and the associated
                      spec axiom has merit > M and
                      M > PM, continue; otherwise go to
                      [3.3]. Is there a SPECLIST entry
                      whose spec literal contains no
                      constants? If not, go to [3.3].
                      Otherwise, if the spec axiom of the
                      first such entry has merit $\le$ M,
                      let n point at this axiom, set
                      N $\leftarrow$ MERIT(n), remove clause n
                      from the SPECLIST entry, enter n
                      into A(N), and go to [5].

        [3.2.2] [Constants Are Turned Off]. If the spec
                axiom of any SPECLIST entry has merit $\le$ M,
                call the clause n, set N $\leftarrow$ MERIT(n),
                remove clause n from the SPECLIST entry, enter
                n into merit set A(N), and go to [5].

    [3.3] [Fill By Resolution]. Set FILLMODE $\leftarrow$ 3.

        [3.3.1] [Find Merit Sets]. Find the next pair of
                merit sets $A(M_1)$, $A(M_2)$ such that
                Upperbound($M_1$, $M_2$) = M. If no such pair was
                found and M* > M, go to [4]; else if no pair
                was found and M* = M then stop and declare
                the query to be unanswerable; otherwise go to
                [3.3.2].

        [3.3.2] [Find Clauses in $A(M_1)$ and $A(M_2)$]. Let $C_1$
                and $C_2$ be the next pair of clauses in
                $A(M_1)$ and $A(M_2)$ respectively. If no such
                pair was found go to [3.3.1]; otherwise go to
                [3.3.3].

        [3.3.3] [Find Resolvents of $C_1$ and $C_2$]. Let n
                point to the next resolvent of $C_1$ and $C_2$.
                If no such resolvent can be found, go to
                [3.3.2]; otherwise, set n $\leftarrow$ FILTER(n). If
                n should be eliminated, go to [3.3.2]. Else,
                let N = MERIT(n), enter clause n into
                merit set A(N), and go to [5].

[4] [Find Next Merit Set]. Set M $\leftarrow$ NEXTMERIT(M),
    FILLMODE $\leftarrow$ 1, and go to [3].

[5] [Create SPECLIST Entries]. For each literal in clause n,
    as permitted by the inference system in force, create a
    SPECLIST entry as described in Section 4.4.

[6] [Test For Null Clause]. If N > WM, set WM $\leftarrow$ N, and
    recalculate M* to be the merit of the worst merit resolvent between
    $C_1 \in A(WM)$ and $C_2 \in A(M)$ such that M $\le$ WM. Else, if
    clause n is the null clause, print a message that a refutation
    has been found, and stop.

    [6.2] [Test For Axioms Of Merit < N]. If CONST = 0 or if
          no constant occurs in the first spec literal,
          or if the user has set BASER = 0, go to
          [6.3]. If the first spec axiom has merit $\ge$ N, go to
          [6.3]; otherwise, let $n_1$ point at the spec axiom and
          remove the axiom from the SPECLIST. Calculate the
          predicted merit, $m_p$, of a resolvent of two clauses
          whose merits are equal to those of the spec clause and
          the spec axiom, $n_1$, respectively. If PM < $m_p$, set
          PM $\leftarrow$ $m_p$. Set OP $\leftarrow$ "Axiom", and go to [6.4].

    [6.3] [Infer Clause of Merit $\le$ M].

        [6.3.1] [From Factors]. Is there another factor $n_1$
                of clause n such that MERIT($n_1$) $\le$ M? If
                not, go to [6.3.2]; otherwise, set
                $n_1$ $\leftarrow$ FILTER($n_1$). If $n_1$ should be
                eliminated, go to [6.3.1]. Else, set
                OP $\leftarrow$ "Factor", and go to [6.4].

        [6.3.2] [Find Merit Set]. Find the next merit set,
                A(M''), such that Upperbound(N, M'') $\le$ M. If
                no such set was found, go to [6.5]; otherwise,
                go to [6.3.2.1].

            [6.3.2.1] [Find Next Clause]. Let $C_1$ be
                      the next clause in A(M''). If
                      no such clause is found, go to
                      [6.3.2]; otherwise, go to [6.3.2.2].

            [6.3.2.2] [Form Resolvents]. Let $n_1$ point
                      to the next resolvent of n and
                      $C_1$. If no such resolvent can be
                      formed, go to [6.3.2.1]; otherwise,
                      set $n_1$ $\leftarrow$ FILTER($n_1$). If $n_1$
                      should be eliminated, go to
                      [6.3.2.1]. Else, set
                      OP $\leftarrow$ "Resolvent" and go to [6.4].

    [6.4] [Stack The Current Clause]. STACK1 $\leftarrow$ n,
          STACK2 $\leftarrow$ N, STACK3 $\leftarrow$ OP, set
          N $\leftarrow$ MERIT($n_1$), enter clause $n_1$
          into A(N), set n $\leftarrow$ $n_1$, and go to [5].

    [6.5] [Pop The Stack]. If STACK1 = Λ, go to [3]; otherwise,
          n $\leftarrow$ STACK1, N $\leftarrow$ STACK2, and
          OP $\leftarrow$ STACK3. If OP = "Axiom", go to [6.2];
          otherwise, if OP = "Factor", go to [6.3.1]; otherwise,
          go to [6.3.2.2].

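Steps [6.4] and [6.5] amount to simulating recursion with an explicit
stack: the operation that produced the new clause is saved so that,
when the stack is popped, the search resumes at the step that was
interrupted. The sketch below illustrates only that control mechanism.
The class name `RecurseStack` is hypothetical, and the three parallel
stacks STACK1, STACK2 and STACK3 of the algorithm are folded into one
list of triples for brevity.

```python
# Step at which RECURSE resumes for each saved operation, as in [6.5].
RESUME_AT = {"Axiom": "6.2", "Factor": "6.3.1", "Resolvent": "6.3.2.2"}


class RecurseStack:
    """Explicit stack holding (clause, merit, operation) triples."""

    def __init__(self):
        self.frames = []

    def push(self, n, N, op):
        """Step [6.4]: save the current clause before recursing."""
        if op not in RESUME_AT:
            raise ValueError(f"unknown operation: {op}")
        self.frames.append((n, N, op))

    def pop(self):
        """Step [6.5]: restore the saved clause and name the next step.

        An empty stack sends control back to FILL, i.e. step [3].
        """
        if not self.frames:
            return None, None, "3"
        n, N, op = self.frames.pop()
        return n, N, RESUME_AT[op]


stack = RecurseStack()
stack.push("C", (2, 0), "Axiom")
stack.push("D", (1, 1), "Resolvent")
print(stack.pop())  # ('D', (1, 1), '6.3.2.2')
print(stack.pop())  # ('C', (2, 0), '6.2')
print(stack.pop())  # (None, None, '3')
```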
4.6.3 Completeness, Admissibility, and Optimality of Q*

Search algorithms are said to be complete if all nodes in the search
space will be visited (or generated) at some stage of the search. Such
search algorithms might be said to be exhaustive. In terms of a theorem
proving problem, a complete search strategy will generate, at some stage
of the search, every clause C such that C is either in, or is deducible
from, a starting set of clauses (where deductions are made using some
given, fixed set of inference rules).

The Q* algorithm is not complete in the above sense, since in
general the algorithm will fail to generate portions of the search
space. In particular, the Q* algorithm will not generate axioms
whose predicates do not occur with query predicates in a connected
component of the semantic graph.

The fact that the Q* algorithm, in general, does not exhaustively
generate the entire search space is not a negative aspect of the
algorithm. Indeed it is a positive aspect. However, we must assure
ourselves that the Q* algorithm is guaranteed to generate the null
clause whenever the null clause is a node in the search space (i.e.,
whenever it would be generated by a complete search strategy). If such
is the case, we will call the search strategy refutation complete.
Using a refutation complete strategy for a question answering system
assures us, at least theoretically, that if a question can be answered
positively, then the search algorithm will find the answer.

The method of making axioms candidates for generation selects
those axioms which contain a literal which will unify (ignoring literal
signs) with a literal of a generated clause. Initially, the query
literals are used for this purpose. Subsequently, the literals of
inferred clauses, and, when the inference system is not employing
set-of-support, the literals of generated axioms will be used for making
axioms candidates for generation. Thus, the only axioms which will ever
become candidates are those which are a finite cluster distance from the
query (see Section 4.3.4). Any other axioms cannot logically interact
with the candidate axioms or the query clauses, or successors of these
clauses, and hence could never take part in a refutation, since the set
of axioms is assumed to be consistent. If a refutation required the use
of a lemma, then the literals of the lemma would descend from possibly
more general instances of these literals in the ancestor axioms. Hence
these axioms would become candidates.

It is evident that any axioms required in a refutation will become
candidates for generation at some stage of the search. The remaining
question, then, is whether all of the candidates can be generated.
The base clause algorithm gives preference to those axioms which have
become candidates because of generated clause literals containing
constants. Since there are only a finite number of axioms in the
starting set, these will eventually be exhausted, and hence, at some
stage the other candidate axioms will be generated. Since the deductive
search strategy is exhaustive, and since all of the axioms which could
possibly be of use will eventually be turned over to that strategy, the
Q* algorithm is refutation complete.

A search algorithm is said to be admissible if it is guaranteed to
find the shortest path to a goal node. In the case of a search algorithm
for the theorem proving problem, an admissible algorithm would be
guaranteed to find a shortest proof (refutation). The Q* algorithm is
not an admissible algorithm for theorem proving. The algorithm may
overlook a shortest proof because in order to find that proof it would
have to generate axioms to interact with general literals in the
negation of the theorem (query) before it generated axioms to interact
with more specific literals. For example, suppose that the set S of
axioms contains the clauses,

    $S = \{\bar{P}(x)R(x),\ \bar{R}(x)\bar{T}(a),\ \bar{Q}(x)\bar{S}(x),\ T(a),\ S(b)\}$

and that the negation of the theorem contains the two clauses,

    $\sim Q = \{P(a),\ Q(x)\}$ .

The Q* algorithm would first generate P(a) and Q(x). When these
are generated, entries would be made on the SPECLIST for the literals
P(a) and Q(x). The candidate axioms for P(a) would precede those
for Q(x). Thus, the first axiom which would be generated is
$\bar{P}(x)R(x)$. As soon as it is generated, it would be used in an
inference, resulting in the generation of the clause R(a). The SPECLIST
entry created for this literal would again precede that for Q(x), and so
the axiom $\bar{R}(x)\bar{T}(a)$ would be generated next. Its resolvent
would, in turn, be used to enable the axiom T(a) to be generated, and
immediately thereafter, the null clause would be generated. Thus, the
Q* algorithm will find the refutation




However, a shorter proof is: 




Although the Q* algorithm fails to be admissible, we believe that this
will not be a significant disadvantage of the algorithm. If the heuris- 
tics prove powerful enough to bring this sort of deductive power to bear 
on a large enough class of QA problems in a practical setting, then in- 
deed, this will be a small price to pay. 
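
The difference in length between the two refutations in the example
can be checked mechanically. The sketch below works on ground
instances only and rests on an assumption: the negation signs of the
axioms were lost in this copy of the report, so the signs used here
(written with a leading "~") are a reconstruction chosen so that both
proofs described in the text go through. The helper `resolve` is
hypothetical and is not part of MRPPS.

```python
def resolve(c1, c2, atom):
    """Ground binary resolution: c1 contains `atom`, c2 contains its
    negation "~atom"; the resolvent is the union of the remainders."""
    assert atom in c1 and "~" + atom in c2
    return (c1 - {atom}) | (c2 - {"~" + atom})


EMPTY = frozenset()

# Refutation found by Q* (axioms instantiated with x := a): three steps.
step1 = resolve(frozenset({"P(a)"}), frozenset({"~P(a)", "R(a)"}), "P(a)")
step2 = resolve(step1, frozenset({"~R(a)", "~T(a)"}), "R(a)")
step3 = resolve(frozenset({"T(a)"}), step2, "T(a)")

# Shorter refutation (query and axiom instantiated with x := b): two steps.
short1 = resolve(frozenset({"Q(b)"}), frozenset({"~Q(b)", "~S(b)"}), "Q(b)")
short2 = resolve(frozenset({"S(b)"}), short1, "S(b)")

print(sorted(step1), sorted(step2), step3 == EMPTY)
print(sorted(short1), short2 == EMPTY)
```

The first chain needs three resolution steps to reach the null clause
and the second needs two, which is the sense in which the proof found
first is not a shortest one.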
An admissible search algorithm is said to be optimal if there is no
other "comparable" admissible search algorithm which generates fewer 
nodes. Since the Q* algorithm is not admissible, there is nothing to 
be said about its optimality. 
5. Conclusions and Future Directions

The main direction of this research is to explore various deductive
search mechanisms for question answering systems. The research has
been restricted to this scope primarily because of limited funding.
It is hoped that in the future we may extend the work to develop a full
QA System. Experiments could then be conducted with all aspects
of such systems rather than only the deductive mechanism. In
particular, it will be important in the near future to consider large
data bases. However, for the present time, we plan to emphasize
experimental work with the current system, investigating small data
bases and incorporating modifications as necessary.

There are several modifications to the existing system that are
in progress. First, the inference mechanism is being expanded to include
A-ordering (Darlington [1969], Kowalski, Hayes [1969], Slagle [1967]).
Darlington [1969] has speculated that A-ordering is a promising
refinement of resolution for QA Systems, although this conjecture must
be supported by more experimentation than currently exists in the
literature. In addition, since paramodulation is available to the user
as an option, experimentation will be performed with it in conjunction
with the Q* search strategy. We would like to discover whether the
heuristics currently available are sufficient to allow paramodulation to
be used efficiently.

In the area of search strategies MRPPS currently allows the user
to select

1) the type of generalized merit ordering to be used (i.e., Ordering-A
   or Ordering-B),

2) which parameters to treat as "g-type" and which to treat as
   "h-type",

and 3) which clause feature to assign to each component of the feature
   vector F.

It is planned that the Q* algorithm will be modified so that weight
matrices other than an identity matrix will be allowed, as well as
an ordering corresponding to that of Pohl (Equation 4.14). We presently
do not know whether the generalized merit ordering will be superior
to an evaluation function f(n) = g(n) + h(n) with only two
components. Additional experiments are required in this regard. In
addition, new combinations of heuristics need to be devised for the
search strategy.

We believe that the base clause selection strategy is crucial to 
a practical QA system and it is in this component that semantics can 
be utilized most effectively. Various semantic considerations have 
been developed and will be incorporated into the system, although not
all of these ideas have been described in this report.

Experimentation will be performed using the current data base
that consists of genealogical data about Eskimos. Typical data might be
a certain Eskimo's age, name, husband or wife and the names of his or
her children. Since this data base is somewhat limited with respect to
the type of questions that can be asked, it is planned that other large
data bases will be developed that are stored on auxiliary storage
rather than in core. This would be one step towards our long-range
goal of implementing a large-scale question-answering system of practical
utility. However, this is not within the range of the current funding.

Some promising results have already been obtained using the Eskimo
data base. In fact, using only length and level as heuristics, fairly
deep proofs have been obtained efficiently with a data base of 300 unit
data clauses and 85 general axioms. In particular, questions such as
"Who is Joe's mother's mother-in-law?" that require six inference steps
involving the use of general axioms have been answered in about one second.
Questions requiring only data facts in order to be answered have
required about .1 second. It is hoped that these times can be shortened
in future versions of MRPPS and that similar results may be obtained
in the future using a much larger (and more realistic) data base.

Several steps have been taken towards developing an entire QA
System that have not been described in this text. The user can enter
his own data base provided it fits in the available space and is
expressible as a set of clauses. In addition, an algorithm to translate
wffs in first-order predicate calculus into clause form has been
implemented and integrated with the overall system, as has an answer-
extraction algorithm based on the work of Luckham and Nilsson [1970].

Although we have outlined some of the plans we have concerning
future directions for research in QA Systems at Maryland, there are
several conclusions that may be drawn from the current work. MRPPS is
designed so that the data base, the inference mechanisms and the search
strategy can be regarded as separate but interacting entities. We
believe that the inference mechanisms may be regarded as primitive
routines (or operators) that logically deduce new clauses. They should
in no way be considered search strategies that select the "most
appropriate" inference to make. This latter decision must be made by a
high-level control mechanism that references extensive semantic
information.

As a step towards building such a control mechanism, we have
utilized some semantic information about the data base by the use of
cluster heuristics in an evaluation function f(n) = g(n) + h(n) and
other heuristics used in the base clause selection strategy. (This
is in contrast to the purely syntactic considerations of the inference
mechanism.) Some preliminary experiments indicate that both of these
techniques are promising. At the same time, it seems clear that much
more semantic information is needed in order to enable efficient
answering of questions. This should be in the form of advice to the
search strategy about which derivation paths are most likely to
succeed or which will probably never succeed. This advice could be
the result of human insight about the problem to be solved (such as
in PLANNER, Hewitt [1970]) or could be generated by the system based
upon past experience with analogous problems.

One way that this could be accomplished is to store proof
schemata of theorems that have been previously proven. Each schema
would consist of several possible derivation paths, each specifying
(among other things) recommended axioms and their corresponding weights
indicating the relative probability that each axiom has of aiding in the
search. These schemata would be referenced continuously during a proof
and would direct the proof to a large extent. Some limited capabilities
currently exist in the system which are able to do this. However, we
would like to try to achieve a more sophisticated mechanism and attempt
to interface it with the Q* algorithm. In any event, some more
"informed" type of planning mechanism is certainly necessary and this
problem is currently being addressed.

In general, MRPPS provides us with a flexible interactive system
in which the parameters may be varied by the user. Hopefully, we will be
able to gain insight into:

(a) inference mechanisms,

(b) heuristic measures, and

(c) semantic considerations

to help guide a QA System in answering questions. Work with a data
base should permit experiments to be conducted and reported upon
in subsequent reports.